Re: How to get from A to B (actually, from type "A" to type "B")
On 1/6/2013 1:22 PM, Ramon F. Herrera wrote:
I had been using for a long time this (from Boost::Filesystem):
string somestring = "abc/de";
path p = path(somestring);
Only to realize, accidentally, that the conversion is done
automatically. The IDE should help you in those cases:
path p = somestring;
This is mainly an issue, in C++, of detecting whether or not an explicit
conversion is necessary. Note, however, that many style guides tend to
prefer explicit conversions over implicit ones.
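To make the distinction concrete, here is a minimal sketch. LoosePath and StrictPath are hypothetical stand-ins I made up for illustration; neither is boost::filesystem::path. The only difference between them is the explicit keyword on the constructor, which is what decides whether "path p = somestring;" compiles.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-in with a converting (non-explicit) constructor.
struct LoosePath {
    LoosePath(const std::string& s) : text(s) {}
    std::string text;
};

// Hypothetical stand-in whose constructor must be called explicitly.
struct StrictPath {
    explicit StrictPath(const std::string& s) : text(s) {}
    std::string text;
};

// std::string converts implicitly to LoosePath but not to StrictPath,
// although StrictPath can still be constructed from a std::string.
static_assert(std::is_convertible<std::string, LoosePath>::value,
              "implicit conversion is allowed");
static_assert(!std::is_convertible<std::string, StrictPath>::value,
              "implicit conversion is rejected");
static_assert(std::is_constructible<StrictPath, std::string>::value,
              "explicit construction still works");

std::string demo_conversions() {
    std::string somestring = "abc/de";
    LoosePath p = somestring;   // compiles: implicit conversion
    StrictPath q(somestring);   // `StrictPath q = somestring;` would not compile
    assert(p.text == q.text);   // both hold the same text
    return p.text + "|" + q.text;
}
```

A tool (or an IDE) can answer "is the explicit conversion necessary?" with the same kind of check the static_asserts perform.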
This one made me kick myself. I used this many, many times:
const char* sometext = somestring.string().c_str();
Well, it turns out that this one is just as good:
const char* sometext = somestring.c_str();
My question is about R&D done in this particular field. I tried Google
but the word "type" is too ambiguous.
There was actually a project by Google using Clang that automatically
eliminated instances where std::string and const char* interconversion
was being unnecessarily performed. Note that this is a reason why
implicit conversion is frowned upon by style guides. :-)
This problem is very similar to the resolution of Rubik's Cube. Your
expression is in some "scrambled" state and you need the computer to
tell you -not only any path! mind you- but the shortest path (known as
God's algorithm) to the desired type.
No, the algorithm you're looking for is breadth-first search (BFS),
specifically in a directed graph, as taught in any introductory
algorithms class and often in many later ones too.
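For what it's worth, the traversal itself can be sketched in a few lines. This assumes somebody has already handed you the conversion graph; the edges in example_graph() are hypothetical and hand-written by me, not derived from any real library.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <queue>
#include <string>
#include <vector>

// Directed graph: type name -> types reachable by one conversion step.
using ConversionGraph = std::map<std::string, std::vector<std::string>>;

// Returns the shortest chain of types from `from` to `to` (both included),
// or an empty vector if `to` cannot be reached.
std::vector<std::string> shortest_conversion(const ConversionGraph& g,
                                             const std::string& from,
                                             const std::string& to) {
    std::map<std::string, std::string> parent;  // visited type -> predecessor
    std::queue<std::string> frontier;
    parent[from] = from;
    frontier.push(from);
    while (!frontier.empty()) {
        std::string cur = frontier.front();
        frontier.pop();
        if (cur == to) break;  // BFS reaches `to` by a shortest path first
        auto it = g.find(cur);
        if (it == g.end()) continue;  // no outgoing conversions
        for (const std::string& next : it->second) {
            if (parent.count(next) == 0) {
                parent[next] = cur;
                frontier.push(next);
            }
        }
    }
    if (parent.count(to) == 0) return {};
    // Walk the predecessor links back from `to`, then reverse.
    std::vector<std::string> chain;
    for (std::string n = to; n != from; n = parent[n]) chain.push_back(n);
    chain.push_back(from);
    std::reverse(chain.begin(), chain.end());
    return chain;
}

// Hypothetical edges, written by hand for illustration only.
ConversionGraph example_graph() {
    return {
        {"path", {"std::string"}},
        {"std::string", {"path", "const char*"}},
        {"const char*", {"std::string"}},
    };
}
```

With these edges, asking for "path" to "const char*" yields the chain path, std::string, const char*, and asking for a type that is not in the graph yields an empty chain.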
The problem is not doing graph traversal, but actually computing the
graph in the first place: you are basically asking people to solve a
very hard AI problem of inferring intent, and this can be difficult even
for humans with very good documentation. Let's use your filesystem
example to motivate why it's hard.
Suppose you have a file class like so:
class File {
    public String getAbsolutePath();  // Removes . and ..
    public String getCanonicalPath(); // Resolves symlinks
    public String getFilename();
    public String getExtension();
    public int getSize();
    public int getPermissions();      // Unix-style octal permissions
    public int getInodeNumber();      // Unique ID within a Unix filesystem
    public int open();                // Returns file descriptor number
}
What should you return if you want to query File -> String? A human
responder would probably say one of the first two, but there are times
to prefer one over the other (it depends on what you are doing!).
Automated analysis would have to either require the human to annotate
all the conversion methods or use heuristics to guess. The irony is that
getExtension() is probably the simplest method of the lot, so heuristics
based on implementation complexity will probably fail here.
Now suppose you queried File -> int. This, to most humans, is probably a
nonsensical request, but on Unix systems, you might want to get file
descriptor numbers. This would necessitate opening the file, which is a
stateful request. Supporting this kind of query would almost certainly
render a tool useless due to false positives.
If we look at our classic friend, in C and C++, const char *, note that
there tend to be about 4 distinct semantic types that this type refers
to. They are the following:
1. Raw binary data
2. Pure ASCII data, so it should only contain \x01-\x7f.
3. Native platform charset (what you can, e.g., pass into filesystem APIs)
4. Proper UTF-8
Sometimes, functions don't care which of these semantic types is in use,
assuming they're all null-terminated anyways (C's strchr is an example).
Sometimes, though, it matters hugely which definition is in use
(converting to/from UTF-16). The answer I as a human would give for
conversion functions depends heavily on context.
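A small sketch of the difference, under my own assumptions: is_pure_ascii is a hypothetical helper (not a standard function) that checks the \x01-\x7f rule from item 2, and the sample strings are made up. strchr gives a sensible answer for both strings, while the ASCII check tells them apart.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: true if every byte before the terminator lies in
// \x01-\x7f, i.e. the string fits the "pure ASCII" semantic type.
bool is_pure_ascii(const char* s) {
    for (; *s != '\0'; ++s) {
        if (static_cast<unsigned char>(*s) > 0x7f) return false;
    }
    return true;
}

// Returns the offset of the first '/' in s, or -1 if there is none.
// strchr only needs null termination, so it does not care whether the
// bytes are ASCII, UTF-8, or the native platform charset.
long first_slash(const char* s) {
    const char* hit = std::strchr(s, '/');
    return hit ? static_cast<long>(hit - s) : -1;
}

void demo_semantic_types() {
    const char* ascii = "abc/de";
    const char* utf8 = "caf\xc3\xa9/menu";  // "café/menu" encoded as UTF-8

    assert(first_slash(ascii) == 3);
    assert(first_slash(utf8) == 5);   // the é takes two bytes

    assert(is_pure_ascii(ascii));
    assert(!is_pure_ascii(utf8));     // same C++ type, different semantic type
}
```

Both strings have the type const char*, so a type-level conversion graph has no way to tell which of the four meanings a given edge expects.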
That said, I don't know if people have written papers on this topic
before; you might find software engineering conference archives useful
in this regard.
--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth