CMU-CS 15-415/615 - Homework #7

Important - Do not install any extra packages in the VM

We will use the VM settings specified in the Vagrantfile we provide to run and test your program. The Python packages pre-installed in your VM include the following:

Django          # The django framework
psycopg2        # The Postgres Python adapter
textract        # The python package used to extract text from pdf
pytz            # The python package to get datetime of specific timezone.
These are enough for you to implement this project. Please make sure your Python program is runnable under the environment that we provide you. You will get zero points if your program has external dependencies that prevent it from running in our VM setting.

Download and Install the Software

In this homework, all development and testing will be done in Linux using a VirtualBox virtual machine. You will use Postgres as the database backend and the Django framework as the application front end. You do not need to know Django; we have already configured everything for you. We will use Python 2.7 with psycopg2 to implement all the database APIs. Your task is to implement the APIs between the frontend and the database.

To get started, the first thing you need to do is set up the VM software. We are going to download and install VirtualBox and Vagrant. VirtualBox is a general-purpose full virtualizer for x86 hardware, targeted at server, desktop and embedded use. Vagrant is a toolkit for VirtualBox. Follow these steps:

  1. Download and install VirtualBox from this website. Choose the proper platform package according to your operating system.
  2. Download and install Vagrant from this website. Follow the instructions on the website.

Virtual Machine Setup

Vagrant will create a complete development environment in the VM for this assignment. You need to download the configuration files. Untar the file using the following command:

$ tar xvf cmupaper_vm.tar

Enter the directory cmupaper_vm. It contains a single file named Vagrantfile. This is the configuration file that Vagrant uses to set up the virtual environment. Let's launch the VM and start working with it by following the steps below. Make sure you have already installed VirtualBox and Vagrant on your local machine.

  1. Start the VM using the following commands:
    $ cd cmupaper_vm # Enter the vm configuration directory
    $ vagrant up     # Launch the VM
    The last command will provision the new VM for you. Make sure you have network access during provisioning. It will automatically install PostgreSQL and set up the Python virtual environment needed to develop the website. This should take around 5-10 minutes.
  2. After the VM has successfully started, you can control it with the following commands. You have to execute these commands in the cmupaper_vm directory on the host machine (i.e., not inside the VM).
    • To log in to the VM:
      $ vagrant ssh
    • To shut down the VM:
      $ vagrant halt
      Note that after shutdown, all your files are still stored in the VM. You can access them again by booting the VM with vagrant up.
    • To remove the VM (USE WITH CAUTION):
      $ vagrant destroy
      By doing this, you will lose the entire VM.
    • To reset the VM to the initial state (USE WITH CAUTION):
      $ vagrant reload
      You will lose all your data on the VM after doing this. Use this command if you mess up your VM environment.

Deploying Web Application Inside the VM

Now, let's log in to the VM to check out the actual source files of Phase 2. First log in to your VM:
  1. $ cd ./cmupaper_vm # Go to the VM directory on your local machine
  2. $ vagrant up # Boot up your VM if you haven't done so
  3. $ vagrant ssh # Log in to the VM
Inside your VM, you will find a directory ./cmupaper/ under your home directory. It contains all the source code for our database application. This directory should have the following layout:
./cmupaper/
./cmupaper/manage.py           # The python script to start the web server
./cmupaper/simple_checker.py    # A simple sanity checker provided by us to test your API implementations
./cmupaper/hw7proj/             # The configuration of the web application
./cmupaper/media/               # The directory storing all uploaded data from clients
./cmupaper/paper/               # The directory containing all the source code of our project
./cmupaper/paper/functions.py   # This is the only Python file you are going to modify
./cmupaper/paper/[otherfiles]   # Other python source files and html files to run the application

Note that all APIs between the web frontend and the database are defined in ./cmupaper/paper/functions.py. This is the only file you are going to modify.

Although we have not implemented any database APIs yet, the website is runnable! Let's first try it out:

$ cd ./cmupaper # Go to the project directory inside the VM
$ python manage.py runserver 0.0.0.0:8000 # Run the web server
If everything goes well, you will see something like this, indicating that your web server is running correctly:
$ python manage.py runserver 0.0.0.0:8000
Performing system checks...

System check identified no issues (0 silenced).
October 29, 2016 - 21:17:03
Django version 1.10.1, using settings 'hw7proj.settings'
Starting development server at http://0.0.0.0:8000/
Quit the server with CONTROL-C.

Note that it is important to specify the IP address as 0.0.0.0 and the port number as 8000. Otherwise, you cannot access the web server inside the VM through the NAT configuration that Vagrant sets up on your local machine.

Now that the web server is running, you can use the web browser on your local machine to visit the website. Open any browser you like and go to 127.0.0.1:15415. If everything goes well, you will see a login page. We use port 15415 instead of port 8000 because we configured Vagrant's NAT settings to forward traffic from port 15415 outside the VM to port 8000 inside it.
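If the page does not load, it can help to check from the host whether anything is listening on the forwarded port before suspecting the application itself. The following is a minimal sketch that assumes only the address 127.0.0.1:15415 mentioned above; the helper name is_port_open is made up for this example and is not part of the provided code.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # Connection refused, unreachable, or timed out.
        return False
    conn.close()
    return True


if __name__ == "__main__":
    # Run this on the host machine, not inside the VM.
    if is_port_open("127.0.0.1", 15415):
        print("Port 15415 is reachable; try the site in your browser.")
    else:
        print("Port 15415 is closed; is the VM up and runserver running?")
```

A closed port usually means either the VM is not running or the server was not started with 0.0.0.0:8000.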

Database Functions API

At this point, the website is useless: the only thing you can do is type random login credentials. Now we are going to make it more useful by implementing all the database APIs defined in ./cmupaper/paper/functions.py. Note that this is the only file you need to modify in order to run the website. Now let's take a look at this file. See the hints for help transferring files between your VM and your local machine.

In this file, we have defined about 20 APIs for you. We have included detailed inline comments describing the input and return value for each API.

Generally, the web application invokes these APIs through a database wrapper (defined in ./cmupaper/paper/database_wrapper.py). The database wrapper establishes a database connection, passes it to the API, and closes the connection after the API call. Therefore, you do not need to create and terminate the connection yourself.
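The connection life cycle described above can be sketched roughly as follows. This is not the code in database_wrapper.py: call_api and example_api are hypothetical names, and sqlite3 is used purely as a stand-in so that the sketch runs without a Postgres server.

```python
import sqlite3  # stand-in only; the project itself uses psycopg2 and Postgres


def call_api(connect, api_func, *args):
    """Open a connection, run one API function with it, then always close it."""
    conn = connect()
    try:
        return api_func(conn, *args)
    finally:
        # Runs whether api_func returned normally or raised an exception.
        conn.close()


def example_api(conn, x):
    # Hypothetical API: it uses the connection it is handed and never
    # opens or closes a connection itself.
    cur = conn.cursor()
    cur.execute("SELECT 1")
    row = cur.fetchone()
    cur.close()
    return row[0] + x


if __name__ == "__main__":
    print(call_api(lambda: sqlite3.connect(":memory:"), example_api, 41))
```

The point for your own code is the division of labor: an API function receives an open connection as an argument and leaves opening and closing to the caller.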

Each API returns a value of the form (status, retval). The status indicates whether the API call succeeded, and retval is the actual return value of the API. Please read the inline comments of every API carefully and follow the return value format.
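As an illustration of the (status, retval) pattern, here is a minimal sketch. The status values 0 and 1, the function list_usernames, and the users table are assumptions made for this example only; the real status codes, signatures, and schema are the ones given in the inline comments of functions.py and in the provided schema. sqlite3 stands in for psycopg2 so that the example runs anywhere.

```python
import sqlite3  # stand-in only; the project itself uses psycopg2

STATUS_OK = 0     # hypothetical value; use the one documented in functions.py
STATUS_ERROR = 1  # hypothetical value; use the one documented in functions.py


def list_usernames(conn):
    """Hypothetical API: return (status, list of usernames)."""
    cur = conn.cursor()
    try:
        cur.execute("SELECT username FROM users ORDER BY username")
        names = [row[0] for row in cur.fetchall()]
        return (STATUS_OK, names)
    except Exception:
        # Leave the connection in a clean state and report failure
        # instead of letting the exception escape to the caller.
        conn.rollback()
        return (STATUS_ERROR, None)
    finally:
        cur.close()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (username TEXT)")
    conn.executemany("INSERT INTO users VALUES (?)", [("bob",), ("alice",)])
    print(list_usernames(conn))
    conn.close()
```

Returning a failure status on every error path, rather than raising, keeps the return value in the same two-element shape for every call.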

Testing

As we have said, your task is to implement and test all of the APIs defined in ./cmupaper/paper/functions.py. We also provide a simple sanity checker. You can run it under the ./cmupaper directory by the following command:

$ python simple_checker.py
It checks the basic correctness and output format of the APIs you have implemented. Note that it does not test all corner cases. Even if you pass all the sanity tests, you may still fail our final tests. We encourage you to develop your own test cases. Please check out the hints for debugging and testing your program.

Since we are using Postgres in a Python environment, we are going to use psycopg2, a Python adapter for Postgres, to connect our application to the backend. We have provided a simple example of how to use psycopg2 in ./cmupaper/paper/functions.py. Check out the official documentation and tutorials here.
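For orientation, a common psycopg2 calling pattern looks roughly like the sketch below. It is not the example shipped in functions.py; run_statement and the users table in the usage comment are hypothetical. The sketch relies only on the standard connection and cursor methods (cursor, execute, commit, rollback, close), which psycopg2 provides.

```python
def run_statement(conn, sql, params):
    """Run one parameterized statement; commit on success, roll back on error."""
    cur = conn.cursor()
    try:
        # psycopg2 uses %s placeholders. Pass the values as a separate
        # tuple so the adapter does the quoting; do not build the SQL
        # string with Python string formatting.
        cur.execute(sql, params)
        conn.commit()
        return True
    except Exception:  # with psycopg2 you could catch psycopg2.Error instead
        # A failed statement aborts the current transaction in Postgres,
        # so roll back before the connection is used again.
        conn.rollback()
        return False
    finally:
        cur.close()


# Hypothetical usage (table and column names are made up):
# run_statement(conn,
#               "INSERT INTO users (username, password) VALUES (%s, %s)",
#               ("alice", "secret"))
```

Passing parameters separately also protects against SQL injection from values typed into the web forms.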

Note that you have to use the schema provided by us. Otherwise, our auto graders may not be able to grade your program, and you will receive 0 points.

After you have implemented the reset_db() API, you can reset and install your databases by the following steps:

  1. Launch the webserver in the VM
  2. Access the following URL in your local browser: 127.0.0.1:15415/reset and the web server will invoke your reset_db() API.

Submitting

Hard copy

A printed version of your ./cmupaper/paper/functions.py. Don't forget to include your name and andrewId in the header comment of the source file.

Blackboard submission

A tar file named [andrew_id]_hw7_phase2.tar. It should contain the following:

./functions.py                  # Your implementation of all the database APIs
./customize_checker.py          # Your customized checker to test your implementation.
./ReadMe                        # A brief introduction on how to run your checker
Again, don't forget to include your name and andrewId in the header comment of the source file.
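One possible way to build the archive with the three files at its top level is Python's standard tarfile module, sketched below. The function build_submission and its parameters are hypothetical; only the archive name and the three file names come from the list above.

```python
import os
import tarfile


def build_submission(andrew_id, src_dir, out_dir="."):
    """Create <andrew_id>_hw7_phase2.tar from the files in src_dir."""
    names = ["functions.py", "customize_checker.py", "ReadMe"]
    tar_path = os.path.join(out_dir, "%s_hw7_phase2.tar" % andrew_id)
    tar = tarfile.open(tar_path, "w")  # "w" writes a plain, uncompressed tar
    try:
        for name in names:
            # arcname stores each file at the top level of the archive,
            # regardless of where src_dir is on disk.
            tar.add(os.path.join(src_dir, name), arcname=name)
    finally:
        tar.close()
    return tar_path
```

Listing the archive after building it is a cheap way to confirm that it contains exactly the expected files and no extra directory prefix.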

Auto graders

Our auto grader first launches exactly the same VM you use for this project and then runs test scripts against your submitted files. What we do is similar to simple_checker.py, but we perform more sophisticated tests that cover possible corner cases. We will also connect directly to the Postgres database to make sure you use the database correctly.

Hints

Testing advice

You can test your implementations by the following approaches:

  1. Try out your website from your web browser and see if it acts as expected.
  2. Expand the simple sanity checker so that it performs more sophisticated tests.
While #1 seems easier, you may need to look into the web application code to determine what the expected behavior is. Approach #2 takes some time to implement, but because you can rerun the tests as regression tests, it can greatly boost your confidence in the correctness of your program.
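A customized checker for approach #2 could be organized along the following lines. This is only a sketch: run_checks and has_api_shape are hypothetical helpers, not part of simple_checker.py, and the real checks would call your functions in functions.py with a database connection.

```python
def has_api_shape(result):
    """Check that an API result is a (status, retval) pair."""
    return isinstance(result, tuple) and len(result) == 2


def run_checks(checks):
    """Run (name, func) pairs and return the names of the failed checks.

    Each func takes no arguments and returns True on success. A check
    that raises an exception counts as a failure instead of stopping
    the whole run.
    """
    failed = []
    for name, func in checks:
        try:
            ok = bool(func())
        except Exception as exc:
            ok = False
            print("ERROR %s: %r" % (name, exc))
        if not ok:
            failed.append(name)
        print("%s %s" % ("PASS" if ok else "FAIL", name))
    return failed


if __name__ == "__main__":
    # Placeholder checks; replace them with calls to your own APIs.
    failed = run_checks([
        ("pair shape", lambda: has_api_shape((0, None))),
        ("list is not a pair", lambda: not has_api_shape([0, None])),
    ])
    print("%d check(s) failed" % len(failed))
```

Keeping every check as a small named function makes it easy to rerun the whole list after each change to functions.py.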

Debugging suggestions

Once you get Python exceptions from your program, or your program does not act as expected, you can debug in the following ways:

  • Add print statements to your program and check the output in the terminal to see what happened internally. However, remember to delete all debug output before you submit your source code, because it may confuse our auto graders.
  • Use the psql console to check whether you have done the right thing in the database.

File transfer between VM and your localhost

The local directory containing the Vagrantfile used to launch your VM is always synced to the /vagrant directory inside your VM. That is, whatever you put under your local directory ./cmupaper_vm will appear in /vagrant inside your VM and vice versa. Note that /vagrant is under the root directory of your VM.

Checkpoint your progress

In case you accidentally delete your VM or mess it up, always keep a copy of your current progress on your local machine. We strongly suggest that you use svn or git for version control.

Useful links