I. Introduction
The Vilenkin-Chrestenson transform can be viewed as the generalization of the Walsh transform from binary to multiple-valued logic (MVL) functions [4], [20]. Its applications are analogous to those of the Walsh transform in binary logic [9], [11]. Despite the existence of fast algorithms, the time needed to compute the Vilenkin-Chrestenson spectrum of a multiple-valued function is a limiting factor in many applications. Accelerating the computation on devices such as graphics processing units (GPUs) can therefore be of practical importance. The existing algorithms are based on different factorizations of the Vilenkin-Chrestenson transform matrix that are tailored for implementation on central processing units (CPUs). Mapping the Vilenkin-Chrestenson transform to GPUs requires careful selection of a particular fast algorithm, since the different underlying factorizations have significant implications for performance.