This project is built on job data scraped from Indeed. Using Python, SQL, and HTML/CSS/JavaScript, we visualize the job market for Catalina Island, California.
Working in a Jupyter Notebook with Python, we scraped the Indeed website using the requests and Beautiful Soup libraries. We used the browser's 'inspect' feature to get an idea of what information could be scraped based on the common, repeated HTML tags. We began by building the URL and search parameters and using requests to connect to indeed.com, then used Beautiful Soup to get a cleaner view of the HTML and identify which tag level to target for each detail. For each value we wrote and tested an individual snippet to confirm it pulled the data we wanted; after making adjustments as needed, we combined the snippets into a for loop with a page advancer to gather the entire dataset. Finally, we exported the data to CSV format for loading into our database.
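As a rough illustration of that loop, here is a minimal sketch. The query parameters and CSS class names (e.g. `job_seen_beacon`, `companyName`) are assumptions: Indeed's markup changes often, so the selectors would need to be re-verified with the browser's inspect feature.

```python
# Minimal sketch of the scraping loop. The search parameters and class
# names below are assumptions and must be verified against Indeed's
# current markup before use.
import csv
import requests
from bs4 import BeautifulSoup

BASE_URL = "https://www.indeed.com/jobs"
rows = []

for start in range(0, 50, 10):  # page advancer: Indeed's 'start' moves in steps of 10
    params = {"q": "", "l": "Avalon, CA", "start": start}
    response = requests.get(BASE_URL, params=params)
    soup = BeautifulSoup(response.text, "html.parser")

    # Each posting lives in a repeated card element; pull one field per tag.
    for card in soup.find_all("div", class_="job_seen_beacon"):
        title = card.find("h2")
        company = card.find("span", class_="companyName")
        location = card.find("div", class_="companyLocation")
        rows.append([
            title.get_text(strip=True) if title else "",
            company.get_text(strip=True) if company else "",
            location.get_text(strip=True) if location else "",
        ])

# Export the scraped postings to CSV for loading into the database.
with open("indeed_jobs.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["title", "company", "location"])
    writer.writerows(rows)
```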
We used PostgreSQL as our database, creating a table to hold the imported CSV data. We then reviewed and normalized the data so it was easier to understand: locations written in different ways were standardized, unnecessary words were removed, and salary information was converted to a yearly figure. We also added a unique ID column to serve as a primary key in case we ever needed to join an additional table.
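A minimal sketch of that normalization pass, run here through psycopg2. The table and column names (`jobs`, `location`, `salary`) and the hourly-to-yearly conversion are illustrative assumptions, not our exact schema.

```python
# Sketch of the normalization updates described above. Table, column
# names, and connection details are assumptions for illustration.
import psycopg2

conn = psycopg2.connect(dbname="jobs_db", user="postgres", password="postgres")
cur = conn.cursor()

# Standardize locations that were written differently.
cur.execute("UPDATE jobs SET location = 'Avalon, CA' WHERE location LIKE 'Avalon%';")

# Convert hourly pay to a yearly salary (40 hrs/week * 52 weeks = 2080 hrs).
cur.execute("""
    UPDATE jobs
    SET salary = salary * 2080, salary_period = 'yearly'
    WHERE salary_period = 'hourly';
""")

# Add a unique ID to act as a primary key for future joins.
cur.execute("ALTER TABLE jobs ADD COLUMN id SERIAL PRIMARY KEY;")

conn.commit()
cur.close()
conn.close()
```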
With clean data in hand, we used HTML, JavaScript, CSS, and Bootstrap to build a professional-looking front-end page to hold our data and visualizations.
For our first visualization we used the JavaScript library Leaflet to map the locations of available jobs. Because we are only looking at data from a small geographic area, we chose the marker cluster option: clusters show the number of jobs at a glance, and as the user zooms in they separate into individual markers that display some of the job details when selected.
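Our map is written directly in JavaScript, but the same idea can be sketched in Python with folium, a library that wraps Leaflet and its MarkerCluster plugin. The coordinates and job details below are placeholders.

```python
# Illustrative sketch using folium, a Python wrapper around Leaflet and
# its MarkerCluster plugin (the production map is plain JavaScript).
# Coordinates and job details below are placeholders.
import folium
from folium.plugins import MarkerCluster

# Center the map on Catalina Island.
job_map = folium.Map(location=[33.39, -118.42], zoom_start=11)
cluster = MarkerCluster().add_to(job_map)

jobs = [
    {"title": "Tour Guide", "company": "Example Co.", "lat": 33.3428, "lon": -118.3282},
    {"title": "Line Cook", "company": "Example Co.", "lat": 33.3430, "lon": -118.3270},
]

# Each marker joins the cluster; zooming in splits the cluster into
# individual markers whose popups show the job details.
for job in jobs:
    folium.Marker(
        location=[job["lat"], job["lon"]],
        popup=f"{job['title']} - {job['company']}",
    ).add_to(cluster)

job_map.save("job_map.html")
```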
To make all of our data accessible, we built a filterable HTML table. First we converted the data to JSON and exposed it as a variable our JavaScript could reference (sketched below). We built the table framework in a new HTML page linked from index.html so the data can be viewed online, then used D3 to grab each value and compile it into the table, with the CSS file upgrading its appearance. Finally, we wrote a filter function: as you start typing a word, the table filters across all columns for your search value, adjusting after each letter is typed or deleted, with no button press or Enter key needed.
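A minimal sketch of that JSON-conversion step: it turns the cleaned CSV into a data.js file defining the variable the page's JavaScript reads. The file and variable names (`indeed_jobs.csv`, `data.js`, `tableData`) are assumptions.

```python
# Sketch: convert the cleaned CSV into a JavaScript variable so the
# page can load it with a plain <script src="data.js"> tag.
import csv
import json

with open("indeed_jobs.csv", newline="") as f:
    records = list(csv.DictReader(f))  # one dict per job posting

with open("data.js", "w") as f:
    f.write("var tableData = ")
    json.dump(records, f, indent=2)
    f.write(";")
```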
We used Heroku to host an API where people can post new jobs. We created a form that writes to a SQLite database and uses Plotly to map out the postings submitted by users.
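A minimal sketch of the posting endpoint, assuming Flask as the web framework (Heroku hosts the app); the route, form fields, and schema are illustrative assumptions.

```python
# Sketch of a form-to-SQLite posting endpoint, assuming Flask.
# Route, field names, and schema are assumptions for illustration.
import sqlite3
from flask import Flask, request, redirect

app = Flask(__name__)
DB = "postings.sqlite"

def init_db():
    # Create the postings table on first run.
    with sqlite3.connect(DB) as conn:
        conn.execute("""CREATE TABLE IF NOT EXISTS postings (
                            id INTEGER PRIMARY KEY AUTOINCREMENT,
                            title TEXT, company TEXT, location TEXT)""")

@app.route("/post", methods=["POST"])
def post_job():
    # Store the submitted form fields, then return to the map page.
    with sqlite3.connect(DB) as conn:
        conn.execute(
            "INSERT INTO postings (title, company, location) VALUES (?, ?, ?)",
            (request.form["title"], request.form["company"], request.form["location"]),
        )
    return redirect("/")

if __name__ == "__main__":
    init_db()
    app.run()
```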
Finally, we tested the site end to end to make sure all connections function.
Based on the data we found, several cities on Catalina Island have a high number of job opportunities. The best part of working for the island company is that you're stuck on an amazing island; the worst part is that you're stuck on the island. However, the island is away from all the hustle and bustle of the mainland. It's relaxed, and most of the stresses of mainland life don't exist on Catalina.