According to a Stack Overflow survey, over 33% of people learning to code learn Java. Learning to code can be intimidating because:
Everything looks vast and overwhelming.
There are so many different approaches that one may not know where to start.
The start is often too theoretical, and learners struggle to connect theory with practice. The "Hello World" program alone does not help much.
The best way to learn to code is to actually write code. So let's make a rudimentary sentiment analysis program in 15 minutes.
This will help you to learn the following concepts:
Data Type
String manipulation
Array
List
Map
Loops
File operation
What is sentiment analysis?
Sentiment analysis means deciding whether a piece of text expresses a positive or a negative opinion. In this era of social media, many digital platforms run some form of sentiment analysis to either boost or limit the reach of a post based on its content.
How do they do it?
"You are doing nice work!"
We all agree that this sentence is said with positive sentiment. Why? Because there is the word "nice" in the sentence.
"So disgusting!"
This sentence radiates negative sentiment. Why? Because there is the word "disgusting" which has a negative connotation.
So, if we need to make a simple classifier, we can follow a simple process:
Make a file containing a list of words and classify the words as either positive or negative.
Read the file and store the word as the key and the sentiment as the value in a map.
Read the sentence to analyse, break it into words, and check whether each word exists in the map.
Count how many of the matched words are positive and how many are negative.
If there are more positive words, then the sentence is positive.
If there are more negative words, then the sentence is negative.
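The process above can be sketched in a few lines before any file handling is involved. This is a minimal illustration, not the final program: the class name ClassifierSketch, the method classify, and the hard-coded two-word map are assumptions made for this example, and a tie is reported as "neutral".

```java
import java.util.HashMap;
import java.util.Map;

class ClassifierSketch {

    // Counts known positive and negative words and returns the majority label.
    static String classify(String sentence, Map<String, String> wordToSentiment) {
        int positive = 0;
        int negative = 0;
        for (String word : sentence.split(" ")) {
            String sentiment = wordToSentiment.get(word); // null when the word is unknown
            if ("positive".equals(sentiment)) {
                positive++;
            } else if ("negative".equals(sentiment)) {
                negative++;
            }
        }
        if (positive > negative) {
            return "positive";
        }
        if (negative > positive) {
            return "negative";
        }
        return "neutral";
    }

    public static void main(String[] args) {
        // Hard-coded stand-in for the word file (an assumption for this sketch).
        Map<String, String> wordToSentiment = new HashMap<>();
        wordToSentiment.put("nice", "positive");
        wordToSentiment.put("disgusting", "negative");

        System.out.println(classify("You are doing nice work", wordToSentiment)); // positive
        System.out.println(classify("So disgusting", wordToSentiment));           // negative
    }
}
```

The steps that follow build the same idea up gradually, with the word list read from a file instead of being hard-coded.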
We will be using some built-in Java classes and methods that may not be intuitive for a beginner. We recommend that you use them anyway: make the program work first, then read more about each one for a deeper understanding.
Code
Step 1
Create a file called SentimentAnalysis.java containing a class called SentimentAnalysis.
Write a main method:
public class SentimentAnalysis {
    public static void main(String[] args) throws Exception {
        System.out.println("*** Hello World ***");
    }
}
//Output
*** Hello World ***
Step 2
Create a file called sentiments.txt in the same folder, with the following content:
positive=good,well,amazing,awesome,nice
negative=bad,sick,disgusting,worse,kill,death
Step 3
Read the file sentiments.txt, iterate over its lines, and store each word with its sentiment in a map:
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.HashMap;
import java.util.Map;

public class SentimentAnalysis {
    public static final String POSITIVE = "positive";
    public static final String NEGATIVE = "negative";

    public static void main(String[] args) throws Exception {
        // Hello World
        System.out.println("*** Hello World ***");

        // Read a file
        String file = "sentiments.txt";
        BufferedReader reader = new BufferedReader(new FileReader(file));
        String currentLine = reader.readLine();

        // Map from each word to its sentiment ("positive" or "negative")
        Map<String, String> wordToSentiments = new HashMap<>();

        // Loop using "while" until there is nothing to read.
        while (currentLine != null) {
            System.out.println(" Reading from file " + currentLine);
            // Split line using "=": the sentiment is on the left, the words on the right
            String[] sentiments = currentLine.split("=");
            System.out.println(" First element of sentiments array " + sentiments[0]);
            System.out.println(" Second element of sentiments array " + sentiments[1]);
            System.out.println("================================");
            fillMap(wordToSentiments, sentiments[0], sentiments[1]);
            currentLine = reader.readLine();
        }
        reader.close();
    }

    // Split the comma-separated words and store each one with its sentiment.
    private static void fillMap(Map<String, String> wordToSentiments, String sentiment, String words) {
        String[] wordArr = words.split(",");
        for (String word : wordArr) {
            wordToSentiments.put(word, sentiment);
        }
        System.out.println(" Map " + wordToSentiments);
    }
}
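To see what the two split calls in the code above do, here is a small standalone sketch that parses a single line of sentiments.txt. The class name SplitDemo and the helper methods labelOf and wordsOf are made up for this illustration. It assumes every line contains an "=" sign; a line without one would cause an ArrayIndexOutOfBoundsException in wordsOf.

```java
import java.util.Arrays;

class SplitDemo {

    // Returns the part before "=", for example "positive".
    static String labelOf(String line) {
        return line.split("=")[0];
    }

    // Returns the comma-separated words after "=" as an array.
    static String[] wordsOf(String line) {
        String wordList = line.split("=")[1];
        return wordList.split(",");
    }

    public static void main(String[] args) {
        String line = "positive=good,well,amazing,awesome,nice";
        System.out.println(labelOf(line));                 // positive
        System.out.println(Arrays.toString(wordsOf(line))); // [good, well, amazing, awesome, nice]
    }
}
```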
Let's look at some of the built-in Java classes used here:
FileReader
Convenience class for reading character files. The constructors of this class assume that the default character encoding and the default byte-buffer size are appropriate.
BufferedReader
Reads text from a character-input stream, buffering characters so as to provide for the efficient reading of characters, arrays, and lines.
The buffer size may be specified, or the default size may be used. The default is large enough for most use cases.
In general, each read request made by a Reader causes a corresponding read request to be made of the underlying character or byte stream. It is therefore advisable to wrap a BufferedReader around any Reader whose read() operations may be costly, such as FileReaders and InputStreamReaders. For example,
BufferedReader in = new BufferedReader(new FileReader("foo.in"));
will buffer the input from the specified file. Without buffering, each invocation of read() or readLine() could cause bytes to be read from the file, converted into characters, and then returned, which can be very inefficient.
Programs that use DataInputStreams for textual input can be localized by replacing each DataInputStream with an appropriate BufferedReader.
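A common way to use these two classes together is a try-with-resources block, which closes the reader automatically even if reading fails partway. The sketch below is one possible way to write it, not the only one. The names ReadLinesDemo and readLines are hypothetical, and the temporary file only stands in for sentiments.txt so that the example can run on its own.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ReadLinesDemo {

    // Reads every line of the file into a list; the reader is closed automatically.
    static List<String> readLines(String fileName) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new FileReader(fileName))) {
            String line = reader.readLine();
            while (line != null) {
                lines.add(line);
                line = reader.readLine();
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Temporary file used only as a stand-in for sentiments.txt.
        Path temp = Files.createTempFile("sentiments", ".txt");
        Files.write(temp, Arrays.asList("positive=good,nice", "negative=bad,sick"));

        for (String line : readLines(temp.toString())) {
            System.out.println(line);
        }
        Files.delete(temp);
    }
}
```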
HashMap
HashMap is a data structure that stores key-value pairs. Its basic operations, get and put, take constant time, O(1), on average.
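The handful of HashMap operations this program relies on can be tried in isolation. The sketch below uses a made-up class name, HashMapDemo, and a tiny sample map chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class HashMapDemo {

    // Builds a tiny word-to-sentiment map for the demonstration.
    static Map<String, String> sampleMap() {
        Map<String, String> map = new HashMap<>();
        map.put("good", "positive");
        map.put("bad", "negative");
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = sampleMap();

        System.out.println(map.get("good"));                      // positive
        System.out.println(map.containsKey("table"));             // false
        System.out.println(map.get("table"));                     // null, the key is missing
        System.out.println(map.getOrDefault("table", "unknown")); // unknown

        // Putting an existing key replaces its value; the map still has 2 entries.
        map.put("bad", "positive");
        System.out.println(map.get("bad"));                       // positive
        System.out.println(map.size());                           // 2
    }
}
```

Because get returns null for a missing key, the program checks containsKey before reading the sentiment.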
Step 4
Add a method to analyse the sentence: break the sentence into words and count the sentiments using the map. Run the program like this (launching a .java file directly requires Java 11 or later, and sentiments.txt must be in the current directory):
java SentimentAnalysis.java "I am a good day"
The analyseSentiment method splits the sentence into an array of words, then iterates over the array and checks whether each word exists in the map. If it does, the method counts its sentiment. Finally, it prints whether the sentence is positive, negative, or neutral.
Here is the complete code:
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.HashMap;
import java.util.Map;

public class SentimentAnalysis {
    public static final String POSITIVE = "positive";
    public static final String NEGATIVE = "negative";

    public static void main(String[] args) throws Exception {
        // Hello World
        System.out.println("*** Hello World ***");

        // Read a file
        String file = "sentiments.txt";
        BufferedReader reader = new BufferedReader(new FileReader(file));
        String currentLine = reader.readLine();

        // Map from each word to its sentiment ("positive" or "negative")
        Map<String, String> wordToSentiments = new HashMap<>();

        // Loop using "while" until there is nothing to read.
        while (currentLine != null) {
            System.out.println(" Reading from file " + currentLine);
            // Split line using "=": the sentiment is on the left, the words on the right
            String[] sentiments = currentLine.split("=");
            System.out.println(" First element of sentiments array " + sentiments[0]);
            System.out.println(" Second element of sentiments array " + sentiments[1]);
            System.out.println("================================");
            fillMap(wordToSentiments, sentiments[0], sentiments[1]);
            currentLine = reader.readLine();
        }
        reader.close();

        analyseSentiment(args, wordToSentiments);
    }

    // Count positive and negative words in the first command-line argument and print the verdict.
    private static void analyseSentiment(String[] args, Map<String, String> wordToSentiments) {
        if (args.length != 0) {
            String lineToTest = args[0];
            System.out.println("!! Sentence to Test: " + lineToTest);

            // Find how many negative words and how many positive words there are in this sentence.
            // Break the sentence on spaces
            String[] words = lineToTest.split(" ");

            // Iterate and count positive and negative words
            int positive = 0;
            int negative = 0;
            for (String word : words) {
                if (wordToSentiments.containsKey(word)) {
                    String sentiment = wordToSentiments.get(word);
                    if (sentiment.equals(POSITIVE)) {
                        positive = positive + 1;
                    }
                    if (sentiment.equals(NEGATIVE)) {
                        negative = negative + 1;
                    }
                }
            }

            if (positive > negative) {
                System.out.println("It's a positive sentence");
            } else if (positive < negative) {
                System.out.println("It's a negative sentence");
            } else {
                System.out.println("It's a neutral sentence");
            }
        }
    }

    // Split the comma-separated words and store each one with its sentiment.
    private static void fillMap(Map<String, String> wordToSentiments, String sentiment, String words) {
        String[] wordArr = words.split(",");
        for (String word : wordArr) {
            wordToSentiments.put(word, sentiment);
        }
        System.out.println(" Map " + wordToSentiments);
    }
}
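One limitation worth knowing: the program compares words exactly, so "Nice" or "disgusting!" will not match the lowercase entries "nice" and "disgusting" in the map. A possible refinement, shown here only as a sketch, is to normalize each word before looking it up. The class NormalizeDemo and the method normalize are hypothetical names, and the approach assumes plain English words made of the letters a to z.

```java
import java.util.Locale;

class NormalizeDemo {

    // Lower-cases a word and removes every character that is not a letter a-z.
    static String normalize(String word) {
        return word.toLowerCase(Locale.ROOT).replaceAll("[^a-z]", "");
    }

    public static void main(String[] args) {
        String sentence = "So Disgusting!";
        for (String word : sentence.split(" ")) {
            System.out.println(word + " -> " + normalize(word));
        }
        // Prints:
        // So -> so
        // Disgusting! -> disgusting
    }
}
```

Calling such a method on each word before the containsKey check would let the earlier example "So disgusting!" be recognized as negative.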
And that is how you can write your first sentiment analysis program in just 15 minutes.
Let us know what other projects you would like us to break down and cover in future articles.