Text & Social Analytics Project

ChirpCheck - Bird Identifier

About this Project

ChirpCheck is a bird identification app that identifies birds from textual descriptions the user provides. A machine learning model predicts the bird species from the user's input, and the app then shows information about that species, such as its habitat, diet, and conservation status.

My Role and Solutions

As the group leader of a four-member team, I was responsible for the overall direction of the project and for ensuring that it was delivered on time. I was also responsible for developing the machine learning models and for developing and deploying the application.

Work Process

Planning Phase

We narrowed the scope of the project to four birds: the Javan Myna, Black-naped Oriole, Collared Kingfisher and Little Egret. This kept the project deliverable on time and helped ensure that the machine learning model could identify the birds accurately. We plan to expand the number of birds in the future.

Data Collection & Pre-Processing

The data was collected by scraping the internet for textual descriptions of the birds using the BeautifulSoup library in Python. We scraped the description of each search result and the metadata of each site from the search engines Google, Yahoo, AOL and Brave. We managed to collect 150-200 descriptions for each bird species.
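
The page-level part of this step can be sketched as follows. This is a minimal illustration and not the project's actual scraper: the helper name `extract_description` and the sample HTML are hypothetical, and fetching the search-result pages themselves is left out.

```python
from bs4 import BeautifulSoup


def extract_description(html: str) -> dict:
    """Pull the title and meta description out of one HTML page.

    Hypothetical helper for illustration; the project's real scraper
    is not shown in this write-up.
    """
    soup = BeautifulSoup(html, "html.parser")

    # <title> text, if the page has one.
    title = soup.title.get_text(strip=True) if soup.title else ""

    # <meta name="description" content="..."> is the usual page summary.
    meta = soup.find("meta", attrs={"name": "description"})
    description = meta.get("content", "").strip() if meta else ""

    return {"title": title, "description": description}


# Made-up sample page standing in for a real search result.
sample_html = """
<html>
  <head>
    <title>Collared Kingfisher</title>
    <meta name="description" content="A medium-sized kingfisher with a white collar and blue-green back.">
  </head>
  <body></body>
</html>
"""

record = extract_description(sample_html)
print(record["description"])
```

Pages without a title or meta description yield empty strings, so such records can be filtered out before labelling.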

The data was split into training and test sets, then pre-processed by removing special characters and stop words. It was then tokenized and converted into a TF-IDF matrix to be used as input for the machine learning model.
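
To make the pre-processing steps concrete, here is a small sketch using only the Python standard library. The stop-word list, the sample sentences and the function names are assumptions made for illustration, and the TF-IDF formula is the plain textbook form; library vectorizers typically apply smoothed and normalized variants, and the write-up does not say which one the project used.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "with", "is", "of", "in", "has"}


def preprocess(text: str) -> list[str]:
    """Lowercase, strip special characters, tokenize, drop stop words."""
    cleaned = re.sub(r"[^a-z\s]", " ", text.lower())
    return [tok for tok in cleaned.split() if tok not in STOP_WORDS]


def tf_idf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Plain TF-IDF: (term count / doc length) * log(N / document frequency)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    rows = []
    for doc in docs:
        counts = Counter(doc)
        rows.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return rows


# Made-up descriptions, not taken from the project's dataset.
raw = [
    "A black bird with a yellow bill & yellow legs!",
    "The white bird has a long neck and black legs.",
]
tokens = [preprocess(text) for text in raw]
weights = tf_idf(tokens)
print(tokens[0])
print(weights[0])
```

Words that occur in every description (here "bird") get a weight of zero, while words specific to one description (here "yellow") get the highest weights. In a train/test setup the document frequencies should be computed on the training split only and reused for the test split.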

Model Development

scikit-learn was used to develop the machine learning model. Five models (Logistic Regression, Random Forest, Support Vector Machine, Naive Bayes and MLP) were trained and tested, and the best model was selected based on accuracy and k-fold cross-validation. An ensemble model was also created by combining the best Logistic Regression, Naive Bayes and MLP models to improve accuracy and robustness.

The model was able to achieve an accuracy of 95% on the test data, and was then saved and used for the web application.
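
One way such an ensemble could be assembled in scikit-learn is sketched below. The write-up does not say how the three models were combined, so the use of `VotingClassifier` with soft voting, the hyperparameters, the fold count and the toy descriptions are all assumptions made for this example.

```python
from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

# Toy descriptions made up for this sketch; the project used 150-200
# scraped descriptions per species.
texts = [
    "black bird with yellow bill and yellow legs",
    "dark grey bird with white wing patches and orange yellow beak",
    "common black myna with a short crest on the forehead",
    "bright yellow bird with black stripe through the eye",
    "golden yellow bird with black nape and pink bill",
    "yellow oriole with black markings on wings and tail",
    "blue green bird with white collar and large black bill",
    "turquoise kingfisher with white underparts seen near the coast",
    "blue bird with white collar perched on mangrove branch",
    "slender white bird with black legs and yellow feet",
    "white wading bird with long neck and black bill",
    "small white heron feeding in shallow water",
]
labels = (
    ["Javan Myna"] * 3
    + ["Black-naped Oriole"] * 3
    + ["Collared Kingfisher"] * 3
    + ["Little Egret"] * 3
)

# Soft voting averages the class probabilities of the three models.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("nb", MultinomialNB()),
        ("mlp", MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000,
                              random_state=0)),
    ],
    voting="soft",
)

# Keeping the vectorizer inside the pipeline means each fold fits
# TF-IDF on its own training portion only.
model = make_pipeline(TfidfVectorizer(stop_words="english"), ensemble)

cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
scores = cross_val_score(model, texts, labels, cv=cv, scoring="accuracy")
print("fold accuracies:", scores)

model.fit(texts, labels)
print(model.predict(["small white bird with a long neck"]))
```

With only twelve toy sentences the fold accuracies are not meaningful; the point is the shape of the pipeline, which would be fitted on the real training split and scored on the held-out test data.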

Web Application Development

The web application was developed using Flask, a micro web framework for Python. The application was deployed on Render, a cloud platform that allows for easy deployment of web applications. The application was designed to be user-friendly, with a simple and clean user interface that is responsive and works on all devices.
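
A prediction endpoint in Flask might look roughly like the sketch below. The route name `/predict`, the JSON request shape and the `predict_species` helper are hypothetical; in particular, the helper is a keyword placeholder standing in for the saved model, whose loading code is not described in this write-up.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_species(description: str) -> str:
    """Placeholder for the saved model's prediction call.

    Hypothetical stand-in: a real app would load the trained model
    once at startup and call its predict method here.
    """
    return "Little Egret" if "white" in description.lower() else "Javan Myna"


@app.route("/predict", methods=["POST"])
def predict():
    # Accept JSON such as {"description": "white bird with black legs"}.
    payload = request.get_json(silent=True) or {}
    description = str(payload.get("description", "")).strip()
    if not description:
        return jsonify(error="description is required"), 400
    return jsonify(species=predict_species(description))


# Exercise the route with Flask's test client, without starting a server.
with app.test_client() as client:
    response = client.post(
        "/predict", json={"description": "white bird with black legs"}
    )
    print(response.status_code, response.get_json())
```

Rejecting empty input with a 400 response keeps the model from being called on blank text, and the test client makes the route easy to check before deploying.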

Outcome

Overall, the project was a success as we were able to deliver the project on time with all the features implemented. In the future, we plan to expand the number of birds in the application and to improve the accuracy of the machine learning model. We also plan to add more features such as the ability to upload images of birds and to identify them using image recognition.

View the application

Download or View Source Code (GitHub)

Acknowledgements

I would like to thank my team member for her help and support throughout the project.