
chirpycardinal

Codebase for chirpy cardinal

Getting Started

How the code is organized

agent: When you run chirpycardinal, you will create an agent. Agents manage data storage, logging, user message input, bot message output, connections to remote modules, and calls to the handler.

servers: Contains the code needed to run chirpycardinal servers

chirpy: This directory contains the bot’s response generators, remote modules, and dialog management. The core logic of the bot is here. Code in this directory is independent of which agent is used.

chirpy/annotators: When a user utterance is received, all annotators are run on it and their results are stored in state, so that they can be used by the response generators. Annotations include dialog act and user emotion, among others.
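As an illustrative sketch of this flow (the annotator names, state layout, and function shapes here are assumptions, not chirpy's exact API), each annotator's result lands in state under the annotator's name:

```python
# Illustrative sketch only: run every annotator on the user utterance and
# collect results in a state dict, keyed by annotator name, so response
# generators can read them later. Not chirpy's actual interface.
def run_annotators(utterance, annotators):
    state = {}
    for name, annotate in annotators.items():
        state[name] = annotate(utterance)
    return state

annotators = {
    'dialog_act': lambda u: 'question' if u.endswith('?') else 'statement',
    'user_emotion': lambda u: 'neutral',  # a real classifier would go here
}
state = run_annotators("how are you?", annotators)
print(state['dialog_act'])  # question
```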

chirpy/core: The bot’s core logic components.

chirpy/response_generators: Contains all response generators used by the bot. More detail can be found in the Creating a Response Generator section

docker: This is where the dockerfiles, configs, and lambda functions of each remote module are defined.

scrapers: Scrape data from Twitter and Reddit, so that it can be stored in Elasticsearch

test: Integration tests for chirpy. These can be run with the command sh local_test_integ.sh

wiki-es-dump: Processes and stores raw wiki files for use by the response generators. wiki-setup.md contains detailed instructions for this step.

Creating an Agent

Agents manage the bot’s data storage, logging, message input/output, and connections to remote modules. The provided agent class, local_agent.py, stores data locally and inputs/outputs messages as text. By defining your own agent, you can alter any of these components, for example storing data in a Redis instance or inputting messages as audio.
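To make the idea concrete, here is a sketch of the pattern (the class and method names below are illustrative assumptions, not chirpy's actual Agent interface): an agent bundles storage and message I/O, and a custom agent overrides only the components it changes.

```python
# Illustrative sketch only, not chirpy's actual Agent interface.
class LocalAgent:
    """Stores data in a local dict and exchanges messages as text."""
    def __init__(self):
        self.store = {}

    def persist(self, key, value):
        self.store[key] = value

    def output_message(self, text):
        print(text)

class CustomAgent(LocalAgent):
    """A custom agent overrides only what it changes, e.g. storage."""
    def persist(self, key, value):
        # In a real implementation this might be: self.redis.set(key, value)
        super().persist(key, value)

agent = CustomAgent()
agent.persist('last_state', {'turn': 1})
```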

A highlighted feature of the LocalAgent is its init function, which initializes these components.

Creating a new Response Generator

To create a new response generator, you will need to

  1. Define a new class for your response generator
  2. Add your response generator to the handler
  3. (optional) Structure dialogue using treelets

Defining a Response Generator class

You will need to create a new class for your response generator. To do this,

  1. Create a file my_new_response_generator.py in chirpy/response_generators which defines a MyNewResponseGenerator class
  2. Set the class’s name attribute to be 'NEW_NAME'
  3. Define the required functions of your class (the existing response generators show the expected interface)
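As a sketch of the shape this takes (the base class and method names here are illustrative assumptions; only the `name` attribute convention comes from the steps above):

```python
# Hypothetical sketch of a response generator class. The real chirpy base
# class, method names, and signatures differ; this only shows the shape:
# a class with a `name` attribute and a method that proposes a response.
class ResponseGeneratorBase:
    """Stand-in for chirpy's response generator base class."""
    name = None

class MyNewResponseGenerator(ResponseGeneratorBase):
    name = 'NEW_NAME'  # must match the name used in response_priority.py

    def get_response(self, state):
        """Propose a candidate response for the current user utterance."""
        return "Here is my new response generator speaking!"

rg = MyNewResponseGenerator()
print(rg.name)  # NEW_NAME
```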

Adding a Response Generator to the Handler

In order for your response generator to be called, it needs to be added to a) your handler and b) the response priority list. To do this,

  1. Add MyNewResponseGenerator to your handler’s list response_generator_classes in your agent. If you’re using the local agent, you would add this to local_agent.py
  2. Using the name you declared in your response generator class, add a corresponding entry in response_priority.py
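As an illustration of the idea (the actual structure of response_priority.py in chirpy differs, and the other entries and tier values below are invented), the name you declared must appear in the priority table:

```python
# Illustrative only: a simple priority table keyed by each response
# generator's name attribute. chirpy's response_priority.py has its own
# structure; the point is that 'NEW_NAME' must get an entry in it.
RESPONSE_GENERATOR_PRIORITY = {
    'LAUNCH': 1,       # hypothetical existing entries
    'CATEGORIES': 2,
    'NEW_NAME': 3,     # your new response generator
}

def priority_of(rg_name):
    return RESPONSE_GENERATOR_PRIORITY[rg_name]
```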

Using Treelets

If your response generator has scripted components, then you may want to use treelets. Treelets handle branching options of a scripted response generator. Based on a user’s response, one treelet can determine which treelet should go next. This value is stored in the response_generator’s conditional_state. To see an example of how this works in code, look at categories_response_generator.py, categories/treelets/introductory_treelet.py, and categories/treelets/handle_answer_treelet.py.
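A minimal sketch of the treelet pattern described above (the class names and the tuple-return interface here are illustrative assumptions; chirpy's treelet classes, such as those in categories/treelets/, have a richer interface):

```python
# Illustrative sketch: each treelet produces a response and names the
# treelet that should handle the next user turn; that next-treelet value
# is what would be stored in the response generator's conditional_state.
class IntroductoryTreelet:
    name = 'introductory'

    def get_response(self, utterance):
        # Ask the opening question and hand off to the answer handler.
        return "What's your favorite animal?", 'handle_answer'

class HandleAnswerTreelet:
    name = 'handle_answer'

    def get_response(self, utterance):
        return f"I like {utterance} too!", None  # None ends the script

TREELETS = {t.name: t() for t in (IntroductoryTreelet, HandleAnswerTreelet)}

response, next_treelet = TREELETS['introductory'].get_response("hi")
response2, done = TREELETS[next_treelet].get_response("cats")
```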

Running Chirpy Locally

Clone Repository

git clone https://github.com/stanfordnlp/chirpycardinal.git

Set CHIRPY_HOME environment variable

  1. cd into the chirpycardinal directory
  2. Run pwd to get the absolute path to this directory, e.g. /Users/username/Documents/chirpycardinal
  3. Add the following 2 lines to ~/.bash_profile:
  4. Run source ~/.bash_profile
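The two ~/.bash_profile lines are not reproduced above; a plausible version, assuming the path from step 2 (both lines are assumptions, adjust to your setup), would be:

```shell
# Assumed content of the two lines (adjust the path to your own checkout):
export CHIRPY_HOME=/Users/username/Documents/chirpycardinal
export PYTHONPATH=$CHIRPY_HOME:$PYTHONPATH
```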

Set up ElasticSearch Indices and Postgres database

  1. cd into wiki-es-dump/ where the below scripts are located
  2. Follow the instructions in wiki-setup.md to set up the Elasticsearch indices and Postgres database
  3. Set up the twitter opinions database (skip this step if you don't need the opinions response generator)

Configure credential environment variables

Configure the credentials for your ES index as environment variables.

Step 1: copy the following into your ~/.bash_profile:

export ES_PASSWORD=your_password
export ES_USER=your_username
export ES_REGION=your_region
export ES_HOST=your_host
export ES_SCHEME=https
export ES_PORT=your_port

Step 2: run source ~/.bash_profile

Replace the credential in chirpy/core/es_config.json:

"url": your_es_url

Set up the chirpy environment

  1. Make a new conda env: conda create --name chirpy python=3.7
  2. Install pip3 version 19.0 or higher
  3. cd into your new directory
  4. run conda activate chirpy
  5. run pip3 install -r requirements.txt

Install docker, pull images

Install Docker, then pull the images from our Dockerhub repositories:

docker pull openchirpy/questionclassifier
docker pull openchirpy/dialogact
docker pull openchirpy/g2p
docker pull openchirpy/stanfordnlp
docker pull openchirpy/corenlp
docker pull openchirpy/gpt2ed
docker pull openchirpy/convpara

These images contain the model files as well. The images are large and can take a while to download. We recommend allocating 24 GB of disk space to Docker (otherwise it will complain that the disk is full).

Run the text agent

Run python3 -m servers.local.shell_chat. To end your conversation, say “stop”. If the docker images don't exist (i.e. you didn't download them in the step above), the script will attempt to build them, which might take a while.

Building your own docker images

Depending on which docker module you want to rebuild, download the corresponding model below, then run the respective Dockerfile to build the image. Note that there are issues with Python package versioning: Hugging Face transformers has had breaking changes since we wrote this code, so the code needs to be updated. That will likely not happen immediately, but may happen with the next release.

Download and store models

  1. Add a model/ directory to docker/dialogact, docker/emotionclassifier, docker/gpt2ed, and docker/questionclassifier
  2. Download and unzip models in this folder, and move them into the chirpycardinal repo

License

The code is licensed under GNU AGPLv3. There is an exception for currently participating Alexa Prize Teams to whom it is licensed under GNU GPLv3.