Templates for unique id generation for types : a beginning
After reading one of the comments on this blog post, I decided to publish here a proof of concept for the generation of unique ids for types.
Basically, it’s just about one template structure, using very basical templates techniques : specialization, metaprogramming-level recursion and nested enum for the computation. Either a familiarity with template metaprogramming or with functional programming is enough to understand the following code.
If you’re familiar with any scheme of functional recursion, or recursion in its mathematical sense, you’ll get it quite easily. Mathematically, we’re somehow defining a recursive function generate_id this way :
Which, in C++, becomes :
template <typename T>
struct id_generator;
template <typename T>
struct id_generator<T*>
{
enum { result = id_generator<T>::result*2 };
};
template <typename T>
struct id_generator<T&>
{
enum { result = id_generator<T>::result*3 };
};
template <typename T>
struct id_generator< std::vector<T> >
{
enum { result = id_generator<T>::result*11 };
};
template <>
struct id_generator<int>
{
enum { result = 5 };
};
template <>
struct id_generator<char>
{
enum { result = 7 };
};
And a minimal example of use :
int main()
{
std::cout << "int : " << id_generator< int >::result << std::endl
<< "char**** : " << id_generator< char**** >::result << std::endl
<< "vector<int> : " << id_generator< std::vector<int> >::result << std::endl;
return 0;
}
/* Output :
int : 5
char**** : 112
vector<int> : 55
*/
To make it really usable, you’ll have to add much (much much much) more :
- base types, like int and char here, which help for terminating the recursive call
- composition rules, like vector, references and pointers here, which help for propagating the recursion
To ensure the uniqueness here, I use prime numbers, simply [1]. Indeed, for the base types, I give a value being a prime number. For the different compositions, I multiply by a different prime number for each. However, most of you may have noticed that the id only lets you know about what type compositions and base types are used, but gives nothing about the order. If we add a rule for std::list, multiplying by 13 for example, then a list of vectors of ints will have the same id as a vector of lists of int. It’d need additional code and workarounds to take the order of the compositions in account — I guess it is possible though. If any reader here comes up with a solution, let us know. A possible solution would be to drop prime numbers, since they rely on a commutative (abelian) ring [2].
[1] http://en.wikipedia.org/wiki/Fundamental_theorem_of_arithmetic
[2] http://en.wikipedia.org/wiki/Commutative_ring
Look at Boost.Typeof.
http://www.boost.org/doc/libs/1_39_0/doc/html/typeof.html
It uses something similar yet different so you can use it with your own types and not only primitive
I didn’t know (so far) boost.typeof has been using a similar mechanism. Thanks !
Anyway, this technique can work with our own types too.
It seems you assign the same ID to vector* and vector. Obviously, this encoding doesn’t work. You might want to approach the mapping a bit differently. You can think of the type as a tree. Encoding and decoding that tree as a sequence of “symbols” is straightforward. Mapping the sequence of symbols to integers is straightforward, too. I just don’t see the point in all this. What do you need IDs for?
I always forget about worpress eating angle brackets. What I ment to say was you get the same ID for vector<int*> and vector<int>*.
The tree is a nice idea.
I don’t have a particular need for that, it’s rather a little challenge.
I know I get the same ID for these two types — see the end of my post — but connecting such things with math-related stuffs is pretty interesting.
Hello Alp.
Here is my propposal for a unique type ID :
generateid(int) = 1
generateid(char) = 2
generateid(float) = 3
∀T, generateid(pointer to T) = 4 × generateid(T) + 0
∀T, generateid(ref to T) = 4 × generateid(T) + 1
∀T, generateid(vector of T) = 4 × generateid(T) + 2
∀T, generateid(list of T) = 4 × generateid(T) + 3
Indeed, I thought about such a solution since I wrote this blog post. The problem is that this doesn’t not scale for new containers, like, say, std::set. The “4″ and the offset have to scale when adding new “type compositions” AND new types.
Powerful post.