Creating a Sentiment Line Chart for the News with Watson, Python, and D3js

Requirements:
You must have Python installed. Check whether Python is installed from the command line:
python --version

Table of Contents:

  1. Create a Starterapp
  2. Git Clone and Setup
  3. Create Additional Folders and Files
  4. Commit and Push Changes to Repository, Build and Deploy
  5. Create an AlchemyAPI service
  6. Create a Cloudant NoSQL service
  7. Add the Basic Workflow for Request-Response
  8. Get News using AlchemyAPI
  9. Create Helper Functions
  10. Save Responses in Cloudant
  11. Parse Response for D3js
  12. Draw the Line Chart in D3js

You can import AlchemyAPI requests into Postman with this Postman collection.

The source code for the application can be viewed or cloned from Github.

1. Create a Starterapp

  1. Go to Catalog > Boilerplates
  2. Click the ‘Python Flask’ starterapp
  3. For name enter <username>-newssentiment
  4. Go to Overview
  5. Under ‘Continuous Integration’ click ‘Add GIT Repo and Pipeline’ to add a DevOps platform, select ‘Populate the repo with the starterapp package and enable Build & Deploy pipeline’ > Click Continue > Click ‘CLOSE’.
  6. Click ‘EDIT CODE’.
  7. The very first time you log in to the ‘DevOps’ environment you will need to pick a username for the ‘DevOps’ environment.
  8. In the left menu of icons, click the top folder icon, and click ‘Git URL’ to copy the Git repository URL.
  9. If you prefer to edit in the online editor in Bluemix, click the ‘EDIT CODE’ button and then click the second pencil icon in the left menu of icons.
  10. I will continue to work on localhost instead.

2. Git Clone and Setup

  1. To work on your localhost, clone the remote repository from Bluemix DevOps to your localhost from the command line:
    git clone https://hub.jazz.net/git/<username>/<username>-newssentiment
    cd <username>-newssentiment
  2. Set up and activate the virtualenv:
    virtualenv venv
    . venv/bin/activate
  3. To use Python 3 in the virtualenv, create it with the ‘-p python3’ option:
    virtualenv -p python3 venv
  4. Install the Flask package into the right runtime, i.e. ‘python3.5’, thus using ‘pip3.5’:
    pip3.5 install flask
  5. Run Flask with ‘python3.5’ as follows:
    export FLASK_APP=welcome.py
    python3.5 -m flask run

Open a browser and go to http://localhost:5000. You should see the following startup app.
[Image: bluemix-newssentiment]

3. Create Additional Folders and Files

Open the editor:

  1. Open the ‘Procfile’ and verify that the startup command is ‘web: python welcome.py’,
  2. To create a new folder in the online editor, click File > New > Folder from the top menu and enter the folder name, e.g. ‘templates’,
  3. Create the following folders:
    • ‘~/templates’
    • ‘~/static/js’
    • ‘~/static/js/d3js’
    • ‘~/static/js/jquery’
    • ‘~/static/js/jquery-validation’
    • ‘~/mymodule’ , a module folder to hold custom packages.
  4. To create a new file in the online editor, click: File > New > File,
  5. Create the following files:
    • ‘~/.gitignore’,
    • ‘~/.cfignore’,
    • ‘~/templates/index.html’,
    • ‘~/templates/report.html’,
    • ‘~/mymodule/myalchemyapi.py’,
    • ‘~/mymodule/mycloudant.py’,
    • ‘~/mymodule/__init__.py’ , to make the ‘~/mymodule’ folder a Python module you have to create an empty file named ‘__init__.py’.
  6. I am using version 3 of D3js, though recently version 4 was published. Download and unzip ‘https://github.com/d3/d3/releases/download/v3.5.17/d3.zip’ to the ‘~/static/js/d3js’ folder.
    In the online editor, you can import libraries via the import feature. Select the ‘~/static/js/d3js’ folder, then from the top menu, click File > Import > HTTP, paste ‘https://github.com/d3/d3/releases/download/v3.5.17/d3.zip’ and click ‘Submit’,
  7. Also download and install JQuery and JQuery-Validation.
    • Download ‘https://code.jquery.com/jquery-3.1.1.min.js’ to the ‘~/static/js/jquery’ folder.
    • In the online editor, select the ‘~/static/js/jquery’ folder, then from the top menu, click File > Import > HTTP, paste ‘https://code.jquery.com/jquery-3.1.1.min.js’ and click ‘Submit’,
    • Download and unzip ‘https://github.com/jzaefferer/jquery-validation/releases/download/1.15.1/jquery-validation-1.15.1.zip’ to the ‘~/static/js/jquery-validation’ folder.
    • Select the ‘~/static/js/jquery-validation’ folder, then from the top menu, click File > Import > HTTP, paste ‘https://github.com/jzaefferer/jquery-validation/releases/download/1.15.1/jquery-validation-1.15.1.zip’ and click ‘Submit’.

4. Commit and Push Changes to Repository, Build and Deploy

From localhost using git:

  1. Add the following directories to ‘~/.gitignore’ and ‘~/.cfignore’
    venv/
    __pycache__/
  2. Verify the files that have changed
    git status
    You should see
    .gitignore
    mymodule/
    static/js/
    templates/
  3. Add the files to the git index (‘git add .’ also picks up dotfiles such as ‘.gitignore’, which the shell pattern ‘*’ would skip)
    git add .
  4. Commit the files for push to the remote master branch
    git commit -m "setup"
  5. And push the files to the remote master
    git push

From localhost using the CloudFoundry cli:

  1. Make the changes to ‘~/.gitignore’ and ‘~/.cfignore’ as above,
  2. Make sure the cf cli is installed or install it from ‘https://github.com/cloudfoundry/cli/releases’,
  3. Connect to Bluemix and log in
    cf api https://api.ng.bluemix.net
    cf login
    You will be asked to log in with your Bluemix username (registration email) and password, and to select a space.
  4. Push changes to Bluemix
    cf push

From the online editor:

  1. In the left menu of icons, click the third Git icon,
  2. In the right commit message box, type ‘init1’ and click ‘Commit’,
  3. In the left, in the ‘Outgoing’ section, make sure the commits appear and click ‘Push’,
  4. Go to the top right ‘BUILD & DEPLOY’, the default Build stage and Deploy stage should be triggered,
  5. If an error occurs, from the command line on your localhost, use ‘cf login’ and ‘cf logs <username>-newssentiment --recent’ to find detailed log output,
  6. As an alternative to the online Git, you can use the play button next to the application status box in the online editor to ‘Deploy the App from the Workspace’.

5. Create an AlchemyAPI service

  1. Go back to the application overview page,
  2. In ‘Connections’ in the ‘new-console’, click ‘Connect New’, or in classic ‘console’, click ‘ADD A SERVICE OR API’,
  3. This takes you to the Catalog, select the ‘Watson’ category, search and click AlchemyAPI,
  4. In ‘Service Name’ enter ‘<username>-newssentiment-AlchemyAPI’, make sure in the left under ‘Connect to’ the application you created is selected, select the ‘Free’ plan and click ‘CREATE’, and ‘RESTAGE’.
  5. In the ‘new-console’, go to ‘Runtime’ and ‘Environment Variables’, in classic console go to ‘Environment Variables’. Under ‘VCAP_SERVICES’ you will find the credentials of the services or connections, with these you can access the AlchemyAPI.
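
As an alternative to copying credentials by hand, the application can read them from the ‘VCAP_SERVICES’ environment variable at runtime. The following is a minimal sketch, assuming the usual layout of a service label mapping to a list of instances, each with a ‘credentials’ node. The helper name ‘get_credentials’ and the sample values are hypothetical.

```python
import json
import os


def get_credentials(service_label, vcap_json=None):
    """Return the 'credentials' dict of the first bound instance of a service."""
    # On Bluemix, VCAP_SERVICES is an environment variable holding a JSON document.
    raw = vcap_json if vcap_json is not None else os.environ.get("VCAP_SERVICES", "{}")
    services = json.loads(raw)
    instances = services.get(service_label, [])
    if not instances:
        return None
    return instances[0].get("credentials")


# Made-up sample that mimics the assumed shape of VCAP_SERVICES.
sample = json.dumps({
    "alchemy_api": [
        {
            "name": "<username>-newssentiment-AlchemyAPI",
            "credentials": {"apikey": "<your-value>"},
        }
    ]
})

credentials = get_credentials("alchemy_api", sample)
print(credentials["apikey"])
```

On localhost the variable is normally not set, so the helper returns None there unless a JSON string is passed in explicitly.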

6. Create a Cloudant NoSQL service

  1. Go back to the application overview page again,
  2. In ‘Connections’ in the ‘new-console’, click ‘Connect New’, or in classic console, click ‘ADD A SERVICE OR API’,
  3. This takes you to the Catalog again, select the ‘Data and Analytics’ category, and click ‘Cloudant NoSQL DB’,
  4. In ‘Service Name’ enter ‘<username>-newssentiment-Cloudant’, make sure in the left under ‘Connect to’ the application you created is selected, select the free ‘Lite’ plan and click ‘CREATE’, and ‘RESTAGE’.

7. Add the Basic Workflow for Request-Response

Now we are ready to code and connect all the pieces.

I will:

  1. Change the ‘index.html’ and ‘report.html’ pages,
  2. Change code for the ‘/’ route, and create a ‘/search’ route.
  • In your editor, open the ‘welcome.py’ file,
  • Look for the line ‘@app.route('/')’ around line 20, and change the return value to return render_template('index.html')
  • At the top of the file, add ‘render_template’, ‘Response’ and ‘request’ to the import from ‘flask’:
    from flask import Flask, jsonify, render_template, Response, request
  • Remove the code for the routes ‘/myapp’, ‘/api/people’, ‘/api/people/’.
  • Add the ‘/search’ route to the ‘welcome.py’ file, so that your ‘welcome.py’ code now looks as follows:
  • Add the following code to the ‘~/templates/index.html’ file
  • Add the following code to the ‘~/templates/report.html’ file
  • In the online editor, deploy the app from the workspace. On localhost, restart the application
    python3.5 -m flask run
    and in your browser go to http://127.0.0.1:5000/.
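
The basic request-response workflow described above can be sketched as follows. This is an illustration under assumptions, not the original ‘welcome.py’ listing: the form field names ‘searchterm’ and ‘startdate’, the helper ‘build_report_context’ and the ‘create_app’ factory are all hypothetical.

```python
def build_report_context(form):
    """Collect the search form values that report.html echoes back."""
    # 'searchterm' and 'startdate' are hypothetical form field names.
    return {
        "searchterm": form.get("searchterm", "").strip(),
        "startdate": form.get("startdate", ""),
    }


def create_app():
    """Wire the '/' and '/search' routes to the two templates."""
    # Flask is imported here so the helper above works without it installed.
    from flask import Flask, render_template, request

    app = Flask(__name__)

    @app.route("/")
    def index():
        # Show the search form.
        return render_template("index.html")

    @app.route("/search", methods=["POST"])
    def search():
        # For now, just return the submitted input to the report page.
        context = build_report_context(request.form)
        return render_template("report.html", **context)

    return app
```

Calling ‘create_app().run(port=5000)’ would start this sketch locally, provided the two templates exist in the ‘templates’ folder.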

Index.html
[Image: bluemix-newssentiment-index]

Report.html
[Image: bluemix-newssentiment-report]

8. Get News using AlchemyAPI

So far, the application allows a user to enter input information to do a news search, but the server just returns the input information.

Let’s take a look at the options for the AlchemyAPI News service.

  • Open a new browser tab and go to the AlchemyAPI Query Builder tool,
  • In ‘Search articles over’ select for startdate ‘August 1, 2016’ and for enddate select ‘September 16, 2016’, and click ‘Apply’,
  • Enter a ‘where’ search term, change ‘is mentioned’ to ‘anywhere’, change ‘and the Sentiment is’ to ‘Any’, select an appropriate Taxonomy, change ‘in the article’ to ‘body’,
  • In the Return section to the right, select all fields to enrich the search results with all NLP analytics, and click ‘Run Query’.

Now, let’s add the Watson AlchemyAPI News service, so that the search form will actually return real news results.

  1. I am using ‘http.client’, which is a Python 3 standard library module, so I have to change the Python version value in the ‘runtime.txt’ file to ‘python-3.5.1’,
  2. Add the following ‘import’ statement at the top of ‘welcome.py’ to import the custom packages of the ‘mymodule’ module we created earlier, implementing the interaction with the AlchemyAPI and Cloudant services.
    from mymodule import myalchemyapi, mycloudant
    import json
  3. In the ‘~/mymodule/myalchemyapi.py’ file, add a function ‘GetNews()’ as follows:
  4. Go to the application ‘Environment Variables’ page and in the ‘VCAP_SERVICES’ look for the ‘alchemy_api’ node, and copy the ‘apikey’ value in the ‘credentials’ node. Go back to the ‘~/mymodule/myalchemyapi.py’ file and paste the value in place of ‘<your-value>’:
    apikey = "<your-value>"
  5. In the ‘welcome.py’ file, change the ‘Search’ function, as follows:

Some comments to the new ‘welcome.py’ code:

  • I added two helper functions ‘myalchemyapi.FormatDate(startdateStr)’ and ‘myalchemyapi.ParseNews(articles, startdateStr)’, which still need to be added to the ‘mymodule/myalchemyapi.py’ package. I will do that in the next step.
  • The AlchemyAPI GetNews API is more than just a News Search. The API combines news search with Watson NLP analytics in a single call. The real power comes from the input parameters of the API that add complex configuration options that can include multiple NLP analytics to filter search results.
    In the ‘returnfields’ input parameter, I set the ‘return’ parameter for the API, which defines what data and metadata from the search results I want to retrieve. The API documentation has a full list of return values, which is also accessible in this Google spreadsheet with currently 419 options. For now, I am only going to retrieve 3 fields: the title, the publication date and the sentiment score of the overall document.
    Requesting the sentiment score adds a Sentiment Analysis API call for each search result, which adds the following return field:
    "docSentiment": {
      "score": -0.112443998
    }
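
To make the role of the ‘return’ parameter concrete, here is a minimal sketch of how a GetNews request could be assembled and sent with ‘http.client’. The host, the path, the query field ‘q.enriched.url.text’ and the three return field names are assumptions based on the AlchemyData News conventions and should be checked against the output of the Query Builder. The function names are hypothetical.

```python
import http.client
import json
from urllib.parse import urlencode

# Assumed host and path of the AlchemyData News endpoint.
ALCHEMY_HOST = "gateway-a.watsonplatform.net"
ALCHEMY_PATH = "/calls/data/GetNews"

# Assumed field names for title, publication date and document sentiment.
RETURN_FIELDS = [
    "enriched.url.title",
    "enriched.url.publicationDate.date",
    "enriched.url.docSentiment.score",
]


def build_getnews_path(apikey, searchterm, start, end="now", count=100):
    """Build the request path with the search and 'return' parameters."""
    params = [
        ("apikey", apikey),
        ("outputMode", "json"),
        ("start", start),
        ("end", end),
        ("count", count),
        ("q.enriched.url.text", searchterm),  # assumed query field for the article body
        ("return", ",".join(RETURN_FIELDS)),
    ]
    return ALCHEMY_PATH + "?" + urlencode(params)


def get_news(apikey, searchterm, start):
    """Send the HTTPS GET request and return the decoded JSON response."""
    conn = http.client.HTTPSConnection(ALCHEMY_HOST)
    try:
        conn.request("GET", build_getnews_path(apikey, searchterm, start))
        return json.loads(conn.getresponse().read().decode("utf-8"))
    finally:
        conn.close()
```

Keeping the path construction separate from the network call makes it possible to inspect the generated query without spending API calls.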

9. Create Helper Functions

Now, I will create the two helper methods ‘myalchemyapi.FormatDate(startdateStr)’ and ‘myalchemyapi.ParseNews(articles, startdateStr)’.

  1. First, add the following code to the ‘mymodule/myalchemyapi.py’ package:
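
As an illustration of what a date helper such as ‘FormatDate()’ might do, the sketch below converts a date string from the search form into Unix epoch seconds. It assumes that the form submits dates as ‘YYYY-MM-DD’ and that the API accepts epoch seconds for its start date; both are assumptions, and the name ‘format_date’ is hypothetical.

```python
from datetime import datetime, timezone


def format_date(date_str, fmt="%Y-%m-%d"):
    """Convert a date string such as '2016-08-01' to Unix epoch seconds (UTC)."""
    # Interpret the date as midnight UTC so the result does not depend on the server time zone.
    parsed = datetime.strptime(date_str, fmt).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


print(format_date("2016-08-01"))
```

An invalid date string raises a ValueError, which the calling route can catch to show a validation message.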

10. Save Responses in Cloudant

I want to save all search results. First, I have to create a database in Cloudant to store them.

  1. In the ‘new-console’ go to ‘All Items’ or in the classic console in the application overview page, under ‘Services’, click the ‘Cloudant NoSQL’ service, and click the green ‘LAUNCH’ button,
  2. In the left menu, click ‘Databases’,
  3. In the top right of the page, click ‘Create Database’, enter ‘newssentiment’ and click ‘Create’.

Next configure the Python application to use Cloudant to save the search results. I am using the Python Cloudant package.

  1. In your IDE, open the ‘~/mymodule/mycloudant.py’ file and add the following code:
    In the application overview page, go to the ‘Environment Variables’ page and in the ‘VCAP_SERVICES’ look for the ‘cloudantNoSQLDB’ service. Copy the credentials and configure the Cloudant credentials in the ‘~/mymodule/mycloudant.py’ file.
    username = "<cloudant_username>"
    password = "<cloudant_password>"
    cloudantURL = "<cloudant_url>"

    For the cloudantURL, use the format ‘<account>.cloudant.com’ without ‘username:password@’.
  2. The code in the ‘~/mymodule/mycloudant.py’ file imports the ‘Cloudant’ class from the ‘cloudant.client’ module, which is not yet installed:
    from cloudant.client import Cloudant
  3. To install the ‘cloudant’ package for python3.5, from the command line in the root of your project, run:
    pip3.5 install cloudant
    and make sure it gets installed in the ‘venv/lib/python3.5/site-packages/’ folder.
    Note the installation results:
    Installing collected packages: requests, cloudant
    Successfully installed cloudant-2.1.1 requests-2.11.1
  4. Edit the file ‘requirements.txt’ to include the correct package version:
    Flask==0.10.1
    cloudant==2.1.1

    Bluemix uses the ‘requirements.txt’ file to install dependencies, instead of running from a ‘virtualenv’ environment like I do on localhost.
  5. Add the following line in the ‘mymodule/myalchemyapi.py’ package in the ‘ParseNews()’ function around line 70 before parsing the ‘result’:
    # Here everything is OK
    mycloudant.SaveNews(articles)
    sentimentList = []
    docs = articlesJson['result']['docs']
    for doc in docs:

Report.html with AlchemyAPI GetNews API results.
[Image: bluemix-newssentiment-report-walchemy]

11. Parse Response for D3js

The ‘ParseNews(articles=None, startdate1=None)’ function in the ‘mymodule/myalchemyapi.py’ package parses the JSON document from the AlchemyAPI into a format that D3js can load to draw the graph. D3js has its own functions to transform data, but I prefer to do it on the server.

The D3js format for the sentiment data is as follows:
[
    {"publicationDate": publicationDate, "sentiment": sentiment},
    {"publicationDate": publicationDate, "sentiment": sentiment}
]

The ‘ParseNews()’ method creates the correct format, but there are also multiple results per day. For a better display in the D3js graph, I average multiple scores for a single day, and sort the results ascending by date, so D3js will draw a logical line forward in time. The result is a list of unique dates with average sentiment per day.
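
The averaging and sorting step can be sketched as follows. The nesting of each result under ‘source’, ‘enriched’ and ‘url’, and the ‘YYYYMMDDTHHMMSS’ date layout are assumptions about the response; the function name is hypothetical.

```python
from collections import defaultdict


def average_sentiment_by_day(docs):
    """Average the sentiment per publication day, sorted ascending by date."""
    scores_by_day = defaultdict(list)
    for doc in docs:
        url = doc["source"]["enriched"]["url"]
        # Assumed date layout 'YYYYMMDDTHHMMSS'; keep only the day part.
        day = url["publicationDate"]["date"][:8]
        scores_by_day[day].append(float(url["docSentiment"]["score"]))

    result = []
    # 'YYYYMMDD' strings sort chronologically when sorted as text.
    for day in sorted(scores_by_day):
        scores = scores_by_day[day]
        result.append({
            "publicationDate": "{}-{}-{}".format(day[:4], day[4:6], day[6:8]),
            "sentiment": sum(scores) / len(scores),
        })
    return result


def make_doc(date, score):
    """Build a made-up result entry in the assumed response shape."""
    return {"source": {"enriched": {"url": {
        "publicationDate": {"date": date},
        "docSentiment": {"score": score},
    }}}}


sample_docs = [
    make_doc("20160802T000000", 0.5),
    make_doc("20160801T000000", 0.75),
    make_doc("20160802T120000", -0.25),
]
print(average_sentiment_by_day(sample_docs))
```

The dates are emitted as ‘YYYY-MM-DD’ strings so that they match the ‘%Y-%m-%d’ format used on the D3js side.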

12. Draw the Line Chart in D3js

I used Mike Bostock’s Line Chart as the base graph for the Sentiment by Day chart.

Add the following code to the ‘~/templates/report.html’ page.

To select elements and create new HTML elements, you can use d3 as follows:
var body = d3.select("body");
body.append("h1")
    .text("News Sentiment for '" + respSearchTerm + "'");

D3js lets you define and nest functions, e.g.
var formatDate = d3.time.format("%Y-%m-%d");
var x = d3.time.scale().range([0, width]);
var y = d3.scale.linear().range([height, 0]);
var xAxis = d3.svg.axis().scale(x).orient("bottom");
var line = d3.svg.line()
    .x(function(d) { return x(d.publicationDate); })
    .y(function(d) { return y(d.sentiment); });
data.forEach(function(d) {
    d.publicationDate = formatDate.parse(d.publicationDate);
    d.sentiment = parseFloat(d.sentiment);
});

In this example, the data is passed in from the Flask server in the Response object. The publication dates are parsed into Date objects with the D3js function d3.time.format(), which uses strftime()-style format specifiers, and the sentiment is converted to a floating-point Number with the standard JavaScript function parseFloat().

Similarly, when I draw the line with the function ‘d3.svg.line()’, I can set the x and y coordinates with a custom callback function, which itself calls the previously defined functions x and y, which in turn use a range() function. D3js uses a lot of these kinds of nested functions.
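
To illustrate the idea of a scale function that is called from inside an accessor function, here is a small Python analogy. It is not how D3js is implemented; ‘linear_scale’ and ‘line_y’ are hypothetical names, and the domain of -1 to 1 for the sentiment is an assumption.

```python
def linear_scale(domain, output_range):
    """Return a function that maps values in domain linearly onto output_range."""
    d0, d1 = domain
    r0, r1 = output_range

    def scale(value):
        return r0 + (value - d0) * (r1 - r0) / (d1 - d0)

    return scale


height = 400
# The range is inverted, like range([height, 0]), because SVG y grows downwards.
y = linear_scale((-1.0, 1.0), (height, 0))


def line_y(d):
    """Accessor in the same spirit as function(d) { return y(d.sentiment); }."""
    return y(d["sentiment"])


print(line_y({"sentiment": 0.0}))
```

The outer function fixes the domain and range once, and the returned inner function is then reused for every data point.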

In this part:
svg.append("path")
    .datum(data)
    .attr("class", "line")
    .attr("d", line);

D3js binds the data object to the svg path element, and for the ‘d’ attribute the ‘line’ function is called, which sets the x and y coordinates by applying ‘d.publicationDate’ and ‘d.sentiment’ respectively (see above).

Similarly, it draws the axes, for example the y-axis as follows:
svg.append("g")
    .attr("class", "y axis")
    .call(yAxis)
  .append("text")
    .attr("transform", "rotate(-90)")
    .attr("y", 6)
    .attr("dy", ".71em")
    .style("text-anchor", "end")
    .text("Sentiment");
