This came up in the last sprint planning and again on the Discord channel. We also already discussed this when we rewrote the option parser. Back then we decided against it as no string options are currently used in the planner. But it seems to be a feature that is used in several branches and forks, so I think YAGNI doesn't apply.
I'll copy my answer from Discord which has some hints of where to start:
---
We have this on our backlog as something that we want to do eventually (FD-9). One thing we have to work out there is how strings should be quoted and how quotes inside a string should be escaped. So far, we have stuck closely to Python syntax (e.g., we use [] for lists, we use keyword args, etc.), so having a string literal like "foo\"\\bar" or 'foo\'\\bar' to represent foo"\bar would make sense.
We then would need a separate TokenType for string literals (token_stream.h) and have to adapt the lexer to parse quoted strings correctly (construct_token_type_expressions and possibly split_tokens in lexical_analyzer.cc). With quotation, I don't think we'll have any issues with string tokens like "(", "infinity", "3.14", "true", "4k", "let", "SILENT", etc., which would otherwise clash with other types.
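To make the proposed escaping concrete, here is a minimal sketch of how the raw text of a quoted literal could be turned into the string it represents. The function name unquote_string_literal is hypothetical and does not exist in the code base. As a simplification, any character following a backslash is taken literally, so unlike Python, \n would yield the letter n rather than a newline.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical helper (an assumption, not existing planner code): converts
// the raw text of a quoted literal, including its surrounding quotes, into
// the string it represents. Returns std::nullopt for malformed input.
std::optional<std::string> unquote_string_literal(const std::string &raw) {
    if (raw.size() < 2)
        return std::nullopt;
    const char quote = raw.front();
    if ((quote != '"' && quote != '\'') || raw.back() != quote)
        return std::nullopt;

    std::string result;
    // Iterate over the characters between the opening and closing quote.
    for (std::size_t i = 1; i + 1 < raw.size(); ++i) {
        char c = raw[i];
        if (c == quote) {
            // An unescaped quote may not occur inside the literal.
            return std::nullopt;
        }
        if (c == '\\') {
            if (i + 2 >= raw.size()) {
                // The backslash would escape the closing quote, so the
                // literal is not terminated.
                return std::nullopt;
            }
            // Simplification: take the escaped character literally.
            ++i;
            c = raw[i];
        }
        result += c;
    }
    return result;
}
```

With this sketch, both "foo\"\\bar" and 'foo\'\\bar' come out as foo"\bar, while a literal with an unescaped inner quote is rejected.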
As the next step, the syntax analyzer should parse Tokens of the new type into LiteralNodes. I don't think we need a new class or any special handling here. We can just add the new type to the switch in parse_node to be treated in the same way as BOOLEAN, INTEGER, and FLOAT.
In LiteralNode::decorate(), we then have an additional case, where we have to create a DecoratedASTNode. I wouldn't abuse SymbolNode for this but rather create a new class StringLiteralNode analogous to BoolLiteralNode, IntLiteralNode, FloatLiteralNode.
In LiteralNode::get_type(), we have to return the correct type, which means we have to register it first (insert_basic_type<std::string>(); in TypeRegistry::TypeRegistry()).
I think that is everything, and the only complicated part is parsing quoted strings correctly. For a quick hack to just get this working for most cases, you could use something like this in construct_token_type_expressions (untested):
{TokenType::STRING_LITERAL, R"("[^"]*")"}
This would not allow " inside a string literal, and thus no escaping either, but it would probably be fine for most cases.
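If escaping is wanted later, the pattern could probably be extended so that a backslash followed by any character counts as part of the literal. The sketch below compares the two patterns using std::regex with its default ECMAScript grammar. The helper is_string_literal and the two constant names are made up for illustration, and I have not checked how the lexer actually applies its expressions, so treat this as a standalone experiment rather than a drop-in change. It only covers double-quoted literals.

```cpp
#include <regex>
#include <string>

// Quick-hack pattern from above: no quotes allowed inside the literal.
const std::string SIMPLE_PATTERN = R"("[^"]*")";
// Assumed extension: a backslash escapes the character that follows it;
// all other characters except quote and backslash are taken as they are.
const std::string ESCAPING_PATTERN = R"("(\\.|[^"\\])*")";

// Hypothetical helper for illustration only: checks whether the whole of
// `text` is a double-quoted literal according to `pattern`.
bool is_string_literal(const std::string &text, const std::string &pattern) {
    return std::regex_match(text, std::regex(pattern));
}
```

Here "foo" is accepted by both patterns, "foo\"\\bar" only by the escaping one, and "foo\" by neither, since its final quote is escaped and the literal is never closed.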