Putting the R in REPL: Reading, Tokenizing and Parsing



Getting User Input

This article is written for version 0.1.3. If you'd rather not check out the source (see the installation instructions), you can browse the two files discussed here: src/main.rs and src/lib/tokenize.rs.

I'm using rustyline to handle line editing. I'm trying to write an interpreter, not reinvent the world. Line editing is an important quality-of-life feature, and I recommend using a line-editing library in any interactive program you write. The requirements are simple: only input that reads successfully is added to the history, EOF on a blank first line signifies a quit action, and the user can ask to quit in several ways. Let's look at the read function and some preamble:

    use rustyline::error::ReadlineError;
    use rustyline::Editor;

    use lib::environment::Environment;

    const EXIT_STRINGS: &[&str; 3] = &[",q", "exit", "quit"];

    fn read<T: rustyline::Helper>(rl: &mut Editor<T>) -> Result<String, String> {
        let mut readline = rl.readline(">> ");
        let mut full_line = String::new();
        let mut loop_count = 0;
        loop {
            match readline {
                Ok(line) => {
                    if loop_count > 0 {
                        full_line.push_str(" ");
                    }
                    full_line.push_str(&line);
                    match count_parens(&full_line) {
                        Ok(true) => return Ok(full_line),
                        Ok(false) => readline = rl.readline("... "),
                        Err(_) => return Err("unexpected \")\"".to_string())
                    }
                },
                Err(ReadlineError::Interrupted) => return Err("user interrupt".to_string()),
                Err(ReadlineError::Eof) => {
                    if loop_count > 0 {
                        return Err("end of file".to_string());
                    } else {
                        return Ok(EXIT_STRINGS[0].to_string());
                    }
                },
                Err(err) => return Err(format!("{:?}", err))
            }
            loop_count += 1;
        }
    }

The first thing to note is a constant array of predefined strings that all mean "quit the interpreter". I always implement this part of an interactive program first, because nothing annoys me more than not being able to quit a program.

Examine the setup of the read function. The prompt is initially set to ">> ". This could be made customizable, and that may be a future feature. There is also a loop counter, which matters because the behavior depends on whether the input comes from the first line or a later one.

Finally, the actual meat of the function is inside a potentially infinite loop. Each result from readline is either Ok or an error. An interrupt (CTRL+C) returns an error immediately. An EOF (CTRL+D) is handled according to how many lines the user has entered: on the first line it is treated as a request to quit, and on any later line it is an "end of file" error. For Ok inputs, there is one more validation before the input is finally accepted.

    fn count_parens(s: &str) -> Result<bool, ()> {
        let mut depth: isize = 0;
        for c in s.chars() {
            match c {
                '(' => depth += 1,
                ')' => depth -= 1,
                _ => {}
            }
            if depth < 0 {
                return Err(())
            }
        }
        Ok(depth == 0)
    }

This function does two things. First, it makes sure that a closing parenthesis never appears without a matching opening one (e.g., the S-expression "())" returns Err(())). Second, it reports whether the input is balanced: as long as some opening parenthesis is still unclosed, it returns Ok(false). While the input is incomplete, read changes the prompt to "... " and keeps reading. Once fully balanced input has been entered, the accumulated string is returned.
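To see how the loop and the parenthesis check interact without a terminal, here is a minimal sketch that replaces rustyline with a scripted list of lines. The name read_scripted is hypothetical and not part of the project; running out of lines stands in for EOF, and the interrupt case is left out.

```rust
// Same check as count_parens above, repeated so the snippet compiles alone.
fn count_parens(s: &str) -> Result<bool, ()> {
    let mut depth: isize = 0;
    for c in s.chars() {
        match c {
            '(' => depth += 1,
            ')' => depth -= 1,
            _ => {}
        }
        if depth < 0 {
            return Err(());
        }
    }
    Ok(depth == 0)
}

// Hypothetical stand-in for read: `lines` plays the role of rl.readline,
// and an exhausted iterator plays the role of EOF (CTRL+D).
fn read_scripted<'a, I>(lines: &mut I) -> Result<String, String>
where
    I: Iterator<Item = &'a str>,
{
    let mut full_line = String::new();
    let mut loop_count = 0;
    loop {
        match lines.next() {
            Some(line) => {
                // Join continuation lines with a single space.
                if loop_count > 0 {
                    full_line.push(' ');
                }
                full_line.push_str(line);
                match count_parens(&full_line) {
                    Ok(true) => return Ok(full_line),
                    Ok(false) => {} // still open: keep reading
                    Err(()) => return Err("unexpected \")\"".to_string()),
                }
            }
            // EOF on the first line means quit; later it is an error.
            None if loop_count == 0 => return Ok(",q".to_string()),
            None => return Err("end of file".to_string()),
        }
        loop_count += 1;
    }
}
```

Feeding it the two lines "(+ 1" and "(* 2 3))" yields the single string "(+ 1 (* 2 3))", while the lone line "())" is rejected as soon as the depth goes negative. Note that count_parens looks at every character, so as written a parenthesis anywhere in the input affects the depth.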

    fn main() {
        let mut env = Environment::new();
        let hist_file = "history.txt";

        // `()` can be used when no completer is required
        let mut rl = Editor::<()>::new();
        if rl.load_history(hist_file).is_err() {
            println!("No previous history.");
        }

        loop {
            let input_line = read(&mut rl);
            match input_line {
                Err(e) => {
                    println!("Error: {}", e);
                    continue;
                }
                Ok(expr) => {
                    if EXIT_STRINGS.iter().any(|&s| s == expr) {
                        break;
                    }
                    rl.add_history_entry(expr.as_str());
                    println!("{}", eval(&mut env, &expr));
                }
            };
        }
        rl.save_history(hist_file).unwrap();
    }
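The exit check in main is an exact string comparison, which is easy to miss when reading the loop. As a small sketch, here it is pulled out into a helper; is_exit_command is a hypothetical name used only for illustration and does not exist in the project.

```rust
// Same constant as in the preamble, repeated so the snippet compiles alone.
const EXIT_STRINGS: &[&str; 3] = &[",q", "exit", "quit"];

// Hypothetical helper (not in the project): the exact-match test that main
// performs before handing the input to eval.
fn is_exit_command(input: &str) -> bool {
    EXIT_STRINGS.iter().any(|&s| s == input)
}
```

Because the comparison is exact, "quit" ends the session but "(quit)" or "quit " with a trailing space is passed on to eval like any other input. It is also why read can signal EOF on a blank first line simply by returning ",q".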

Now, let's look at the eval function that main calls:

    fn eval(env: &mut Environment, input: &str) -> String {
        let sexp = match lib::tokenize::tokenize(input) {
            Ok(x) => x,
            Err(f) => return f
        };
        let res = lib::eval::eval(&sexp, env);
        match res {
            Ok(x) => format!("{}", x),
            Err(f) => format!("Error: {}", f)
        }
    }
