COMPSCI 220 Programming Methodology

Exercise Assignment 06: Regexes

Overview

This week’s lectures covered regular expressions (“regex”), a powerful tool for analyzing text. In this assignment, you will practice by writing regexes that validate small strings. To help you make sure your regular expressions are correct, you will write them in Scala.

You may find the following links helpful:

Learning Goals

Test Files

In the src/test/scala directory, we provide ScalaTest test suites that will help you stay on track while completing the assignment. We recommend that you run the tests often and use them to build a checklist of things to do next. Be aware, however, that we deliberately do not provide the full test suite we use when grading.

We recommend that you think about possible cases and add new test cases to these files as part of your programming discipline. Simple tests to add will consider questions such as:

More complex tests will be assignment-specific. To build good test cases, think about ways to exercise functions and methods. Work out by hand the correct result for a call of a method with a given set of parameters, then add it as a test case. Note that we will not be looking at your test cases (unless otherwise specified by the assignment documentation). They are just for your use and will be removed by the auto-grader during the evaluation process.

If you modify the test cases we provided or add your own, be aware that they will not be used in grading. The auto-grader uses its own copy of the public and private tests, so any changes you make to source files in the src/test/scala directory will not be reflected in the grading of your submission.

Before submitting, make sure that your program compiles with and passes all of the original tests. If you have errors in these files, it means the structure of the files found in the src directory has been altered in a way that will cause your submission to lose some (or all) points.

Project Structure

The project should normally contain the following root items:

Testing, Grading Assistant, and Console

As mentioned previously, you are provided a set of unit tests that test various aspects of your implementation. You should get in the habit of running the tests frequently to see how you are doing and to understand where you might be going wrong. The ScalaTest testing framework is built into the activator tool, and you can run the tests by issuing the following command from the command line (Mac/Linux):

./activator test

For Windows users you would issue the following command from the command window:

activator.bat test

This will compile your code and run the public ScalaTest unit tests. After you compile and run the tests you will notice that a target directory has been created. The target directory contains the generated class files from your source code as well as information and results of the tests. Activator uses this directory so it must not be removed. After you run the tests you can get a grade report from the Jeeves tool by issuing this command from the command line (Mac/Linux):

scala -cp tools/grading-assistant.jar autograder.jeeves.Jeeves

For Windows users you would issue the following command from the command window:

scala -cp tools\grading-assistant.jar autograder.jeeves.Jeeves

Note that issuing the above command from the activator console will not work! Run from the regular command line, Jeeves prints a report to the console showing the tests you passed, the tests you failed, and a score based on the public tests. Although this gives you a good estimate of what your final score might look like, it does not include points from our private tests. You may run Jeeves as often as you like, but you must run the tests before you run Jeeves to get an updated result.

Another very useful approach to test and play with the code you write is the activator/Scala console. You can run the console with this command (Mac/Linux):

./activator console

For Windows users you would issue the following command from the command window:

activator.bat console

This will load up the Scala REPL (read-eval-print loop). You can type code directly into the console and have it executed. If you want to cut and paste a larger segment of code (e.g., a function declaration), type :paste in the console, paste in your code, then press Control-D.
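As a rough illustration of the kind of snippet you might paste into the console with :paste, here is a small sketch. The object name ReplDemo and the words pattern are hypothetical and are not part of the assignment.

```scala
import scala.util.matching.Regex

object ReplDemo {
  // Hypothetical pattern for experimenting: one or more ASCII letters.
  val word: Regex = """[A-Za-z]+""".r

  // findAllIn returns an iterator over every non-overlapping match.
  def words(s: String): List[String] = word.findAllIn(s).toList
}
```

After pasting, you could evaluate ReplDemo.words("regex is fun") at the prompt and inspect the result.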

In addition, you can run activator without any commands and you will be presented with the following prompt:

./activator
>

Or on Windows:

activator.bat
>

From the activator prompt you can type in any of the following commands:

Editors and IDEs

You are welcome to use any editor or IDE you choose. We recommend using a basic text editor such as Atom, SublimeText, Notepad++, emacs, or vim.

If you use a text editor you should use activator in a separate terminal window to compile, run, and test your code.

If you have successfully installed and imported your project into IntelliJ you are welcome to continue using it.

Part 1: A first task

All of your code must be written in src/main/scala/regex/Regex.scala.

One of the many applications of regular expressions is validating data. That is, given a piece of data, you can use a regular expression to make sure it is in the correct format:

val input = "23/11/2002" // A date: November 23rd, 2002
// dateRegex stands for a regex that matches dates in dd/mm/yyyy format
dateRegex findFirstIn input match {
    case Some(date) => "input is valid"
    case None => "input is invalid"
}
// output: "input is valid"
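One thing to keep in mind with findFirstIn is that it succeeds as soon as any substring of the input matches. If the goal is to validate the entire string (an assumption here; check the provided tests to see what is expected), the pattern can be anchored with ^ and $. The sketch below uses a hypothetical five-digit pattern, not one of the assignment's regexes, to show the difference.

```scala
import scala.util.matching.Regex

object AnchorDemo {
  // Hypothetical patterns for illustration: exactly five digits.
  val unanchored: Regex = """\d{5}""".r
  val anchored: Regex = """^\d{5}$""".r // ^ = start of input, $ = end of input

  // Same shape as the validation example above, parameterized by regex.
  def describe(regex: Regex, input: String): String =
    regex.findFirstIn(input) match {
      case Some(_) => "input is valid"
      case None    => "input is invalid"
    }
}
```

With these definitions, the unanchored pattern accepts "abc12345xyz" because it contains five digits somewhere, while the anchored pattern rejects it.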

In src/main/scala/regex/Regex.scala, there are six regular expressions that you need to fill in:

Phone numbers: The phone regex validates phone numbers. Valid phone numbers are formatted as XXX.XXX.XXXX or XXX-XXX-XXXX, where X is a digit (0-9).

Basic email addresses: The email regex performs basic validation of email addresses. It checks that the email address has the format:

name@domain.tld

Advanced email addresses: The advEmail regex performs more thorough email address validation. It checks that the email has the format:

name@domain.tld

Basic dates: The date regex validates dates, but only for very simple conditions. Valid dates have the format dd/mm/yyyy, where d, m and y are digits. 23/11/2002 is valid, as is 45/00/9999.

Advanced dates: Obviously, 45/00/9999 should not be a valid date! The advDate regex makes sure that the days and months in the date string are in the correct ranges: given "dd/mm/yyyy", dd must be between 01 and 31 and mm must be between 01 and 12. You do not need to make sure the day is valid for the given month. For example, "31/02/2015" counts as a valid date, even though there are never 31 days in February. (However, you’re welcome to try!)

Names: The name regex validates names. Names are somewhat complex, and have multiple parts, some of which are optional:

[title] firstName [middleInitial] lastName [suffix]

Be very careful of the spacing between elements in the name!
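One way to handle spacing around optional parts is to put the separating space inside the optional group, so that the space appears only when the part does. The sketch below applies this idea to a made-up format, "[Mount ]Name[ Peak]", which is purely hypothetical and deliberately not the name regex.

```scala
import scala.util.matching.Regex

object OptionalDemo {
  // Hypothetical format: optional "Mount " prefix, a capitalized word,
  // optional " Peak" suffix. Each optional group carries its own space.
  val mountain: Regex = """^(Mount )?[A-Z][a-z]+( Peak)?$""".r

  def isMountain(s: String): Boolean = mountain.findFirstIn(s).isDefined
}
```

Because the spaces live inside the groups, "Rainier" matches without a stray leading or trailing space, while "Mount  Rainier" (two spaces) and "Rainier " (trailing space) do not.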

Yes, we’re obviously leaving out many possible names here – the requirements are simplified for this exercise! See below for a better discussion about names.

You may find it easier to break these down into smaller pieces, then combine them later. For example, consider advDate to be 3 smaller regular expressions that are joined by slashes.
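To illustrate the idea of building a regex from smaller pieces, here is a minimal sketch for a different, hypothetical format: a 24-hour time written as HH:MM. The names hour, minute, and time are made up for this example. Each piece restricts its own range, and the pieces are joined with the separator.

```scala
import scala.util.matching.Regex

object ComposeDemo {
  // Hypothetical pieces, not part of the assignment.
  val hour = "([01][0-9]|2[0-3])" // 00-19 or 20-23
  val minute = "[0-5][0-9]"       // 00-59

  // Join the pieces with the separator and anchor the whole pattern.
  val time: Regex = ("^" + hour + ":" + minute + "$").r

  def isTime(s: String): Boolean = time.findFirstIn(s).isDefined
}
```

Keeping each piece in its own value makes it easier to test and reason about one range at a time before combining them.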

Note: The test suite, located in src/test/scala/regex/RegexTestSuite.scala, contains several examples of valid and invalid strings. Each test has a list of test cases, given as tuples (String, Boolean). The Boolean will be true whenever the String is a valid input. You may find these useful for figuring out what your regex needs to accept.
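If you want to experiment with (String, Boolean) cases outside of ScalaTest, for instance in the console, a small helper along the following lines may be useful. This is only a sketch: the hexColor regex, the cases list, and the failures function are hypothetical, and the provided test suite may check matches differently.

```scala
import scala.util.matching.Regex

object CaseCheckDemo {
  // Hypothetical regex: "#" followed by exactly six hexadecimal digits.
  val hexColor: Regex = """^#[0-9a-fA-F]{6}$""".r

  // Each tuple pairs an input with whether it should be accepted.
  val cases: List[(String, Boolean)] = List(
    ("#00ff7F", true),
    ("#00ff7", false),
    ("00ff7F", false),
    ("#00gg7F", false)
  )

  // Returns the cases whose actual result differs from the expected Boolean.
  def failures(regex: Regex, cases: List[(String, Boolean)]): List[(String, Boolean)] =
    cases.filter { case (input, expected) =>
      regex.findFirstIn(input).isDefined != expected
    }
}
```

An empty result from failures means every case behaved as expected; any tuples returned are the ones worth investigating.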

Note: While regular expressions are powerful and useful, they actually aren’t that good for validation! Data tends to be much more complex than we expect it to be, and your regular expressions end up either rejecting valid data or growing so complex as to be unusable. Here are some good links discussing this topic:

Generally speaking, regular expressions are useful when you need to validate or extract data in small, well-known formats.

Submission

When you have completed the changes to your code, you must run the following command to package up your project for submission (Mac/Linux):

scala -cp tools/grading-assistant.jar submission.tools.PrepareSubmission

On Windows:

scala -cp tools\grading-assistant.jar submission.tools.PrepareSubmission

This will package up your submission into a zip file called submission-DATE.zip, where DATE is the date and time you packaged your project for submission. After you do this, log into Moodle and submit the generated zip file.

After you submit the file to Moodle you should ensure that Moodle has received your submission by going to the activity page for the assignment and verifying that it has indeed been uploaded to Moodle. To further ensure that you have uploaded the correct zip file you should download your submission from Moodle and verify that the contents are what you expect.

We do not allow re-submissions after the assignment due date, even if you submitted the wrong file or forgot to submit. There are no exceptions, so be very sure that you have submitted properly.

You should test that you can run the PrepareSubmission tool early so you do not have trouble packaging your assignment up for submission minutes before the deadline.