data analysis on tablet

This project is a database-backed Flask application that analyses a football team in the FIFA 20 game, recommends reasonable transfer targets for that team, and analyses individual players. It runs on a database I built from a Kaggle dataset and from data I scraped myself from the sofifa website.

FIFA 20 is a football video game and part of the long-standing FIFA video game series. One feature of the game is a career mode in which you take control of a football team as its manager. You are given a budget and can shape the team in your image, trying to improve the club and win both domestic and continental trophies. One common issue with this mode is the scouting. For example, when you play as a League One side, your scouts may recommend a 50 million pound player when you have a budget of 1 million. This is frustrating, because you either have to set up a scouting network to find more realistic players in your price range or spend time searching the game database for players that you want and can afford.

With that in mind, I decided to create a Flask app that analyses a team and lets you work out the strengths and weaknesses of the selected team. In addition to the analysis, the app recommends one or more players within the team's price range who would help to improve the team. I have also added a feature that analyses a specific player chosen by the user, provided the player is in the database, and compares them against some of the best players in a similar position.

Tools and environments

  • Python 2
  • MySQL
  • NumPy
  • Matplotlib
  • BeautifulSoup
  • Flask
  • requests
  • urllib2
  • re
  • SQLAlchemy
  • Plotly
  • mysql.connector
  • pymysql
  • io
  • json
  • base64

Datasets

Two datasets are used for the database (finalteamdata_fifa20, players_20). The players dataset comes from Kaggle and contains the important details about the players, such as their in-game rating, transfer value and in-game attributes. I chose this dataset rather than extracting the data myself because the data was already available. It also acts as a base for future updates to FIFA 20 and for future FIFA games.
The second dataset is one I created myself. It contains details about the football teams in the game, such as the league they play in, their budget and their prestige. It was built primarily to support the data analysis and the selection of transfer targets, which is explained in more detail later.

Pre-processing

For the players dataset I removed unnecessary columns (e.g. skill moves, whether the player has a real face, and their body type) that did not provide much useful information for the data analysis.
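
As a rough illustration of this clean-up step, here is a minimal pandas sketch (written for Python 3). The column names `skill_moves`, `real_face` and `body_type` and the two sample rows are assumptions made for the example, not a copy of the project's own code or data.

```python
import pandas as pd

# Made-up rows standing in for the Kaggle players file; the real data would
# be loaded from disk, e.g. pd.read_csv("players_20.csv") (file name assumed).
players = pd.DataFrame({
    "short_name": ["Player A", "Player B"],
    "overall": [78, 71],
    "value_eur": [9000000, 2500000],
    "skill_moves": [4, 2],
    "real_face": ["Yes", "No"],
    "body_type": ["Lean", "Normal"],
})

# Columns judged unhelpful for the analysis (names are assumptions).
UNUSED_COLUMNS = ["skill_moves", "real_face", "body_type"]

# errors="ignore" keeps the call safe if a column is already absent.
players_clean = players.drop(columns=UNUSED_COLUMNS, errors="ignore")

print(players_clean.columns.tolist())
```

Keeping the unwanted column names in one list makes it easy to adjust the selection when a newer version of the dataset adds or renames columns.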

players dataset

Team Dataset

For the team dataset I used sofifa, a reliable site for FIFA 20 game data. I used Beautiful Soup to scrape the information about each team that is important for players of the game to know, e.g. the overall rating, team budget and prestige. After filtering out unwanted values and cleaning up the dataset, the two datasets were ready to be uploaded to SQL.
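
The scraping step could look roughly like the sketch below. It parses an inline HTML snippet rather than a live page, and the table layout, class names and team rows are all hypothetical; the real sofifa markup is different and would need its own selectors.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded team-list page.
SAMPLE_HTML = """
<table>
  <tbody>
    <tr><td class="name">Example FC</td><td class="overall">75</td>
        <td class="budget">12000000</td><td class="prestige">6</td></tr>
    <tr><td class="name">Sample United</td><td class="overall">68</td>
        <td class="budget">1500000</td><td class="prestige">3</td></tr>
  </tbody>
</table>
"""


def cell_text(row, css_class):
    """Return the stripped text of the first <td> with the given class."""
    return row.find("td", class_=css_class).get_text(strip=True)


def parse_teams(html):
    """Turn each table row into a dict of team details."""
    soup = BeautifulSoup(html, "html.parser")
    teams = []
    for row in soup.find_all("tr"):
        teams.append({
            "team": cell_text(row, "name"),
            "overall": int(cell_text(row, "overall")),
            "budget": int(cell_text(row, "budget")),
            "prestige": int(cell_text(row, "prestige")),
        })
    return teams


teams = parse_teams(SAMPLE_HTML)
print(teams)
```

In the real scraper the HTML would come from an HTTP request (the tools list mentions requests), and the resulting list of dicts can be passed straight to `pandas.DataFrame` for cleaning.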

teams dataset

Uploading dataset to SQL

The datasets were split into several dataframes to make setting up the database and the later data analysis easier. Pandas is very versatile for manipulating data and has many useful tools, one of which inserts a dataframe into a SQL database. Using that, the two datasets (after being split into tables) were inserted into fifa20, an empty database that I had created in MySQL.
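
A minimal sketch of splitting a dataframe and inserting the pieces with `DataFrame.to_sql` is shown here. To stay runnable without a MySQL server it writes to an in-memory SQLite database. For MySQL you would instead pass a SQLAlchemy engine, for example one created from a URL such as `mysql+pymysql://user:password@host/fifa20` with your own credentials. The table names, column names and rows are assumptions.

```python
import sqlite3

import pandas as pd

# Made-up rows; the real frame would hold the cleaned players dataset.
players = pd.DataFrame({
    "player_id": [1, 2, 3],
    "short_name": ["Player A", "Player B", "Player C"],
    "club": ["Example FC", "Example FC", "Sample United"],
    "overall": [78, 71, 66],
    "pace": [80, 65, 70],
    "shooting": [75, 40, 60],
})

# Split the wide frame into narrower tables that share player_id as a key.
player_info = players[["player_id", "short_name", "club", "overall"]]
player_attributes = players[["player_id", "pace", "shooting"]]

conn = sqlite3.connect(":memory:")

# index=False stops pandas writing its row index as an extra column.
player_info.to_sql("player_info", conn, if_exists="replace", index=False)
player_attributes.to_sql("player_attributes", conn, if_exists="replace", index=False)

row_count = pd.read_sql_query("SELECT COUNT(*) AS n FROM player_info", conn)["n"][0]
print(row_count)
```

`to_sql` infers column types from the dataframe and does not create keys, which is why primary and foreign keys have to be assigned in the database afterwards.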

insert dataframe to sql

sql schema

The schema shown above was created in MySQL, and primary and foreign keys were assigned. Missing teams were inserted into the database and misspelled values were corrected. SQL queries were then run for the data analysis.
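
To show what assigning keys and correcting values involves, here is a small sketch using Python's built-in sqlite3 module so that it runs without a server. The two tables, their columns and the misspelled team name are hypothetical, and MySQL syntax differs slightly (for example, it needs no PRAGMA to enforce foreign keys with InnoDB).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE teams (
    team_id   INTEGER PRIMARY KEY,
    team_name TEXT NOT NULL UNIQUE,
    league    TEXT NOT NULL
);
CREATE TABLE players (
    player_id  INTEGER PRIMARY KEY,
    short_name TEXT NOT NULL,
    team_id    INTEGER NOT NULL,
    FOREIGN KEY (team_id) REFERENCES teams (team_id)
);
""")

# Insert a missing team, then correct a spelling mistake in its name.
conn.execute("INSERT INTO teams VALUES (1, 'Exmaple FC', 'Example League')")
conn.execute("UPDATE teams SET team_name = 'Example FC' WHERE team_name = 'Exmaple FC'")
conn.execute("INSERT INTO players VALUES (10, 'Player A', 1)")

# A player that points at a non-existent team is rejected by the foreign key.
try:
    conn.execute("INSERT INTO players VALUES (11, 'Player B', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
```

The foreign key is what makes inserting the missing teams necessary: a player row cannot be stored until the team it refers to exists.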

Data analysis

Team Analysis

The primary goal for most teams is to do well in their own league; depending on the team, they may also look to achieve more in domestic and international competitions. With that in mind, the initial idea was to compare each team against the other teams in the same league to see how it ranks. General measures such as the value of the players (more expensive possibly means more quality) were explored first to see if there were any trends. The analysis was then split into four sections (goalkeepers, defenders, midfielders and attackers) for the selected team. In each section, attributes relevant to the position are compared for the players in the selected team, both against players in the same league and against each other.
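
One way to express the league comparison is sketched below with pandas. It compares a team's average rating in each position group against the league average; the column names, the helper `league_comparison` and the sample rows are assumptions, and the app's own queries may aggregate differently.

```python
import pandas as pd

# Made-up players from two clubs in the same league.
players = pd.DataFrame({
    "club": ["Example FC"] * 3 + ["Sample United"] * 3,
    "league": ["Example League"] * 6,
    "position_group": ["attacker", "defender", "midfielder"] * 2,
    "overall": [78, 70, 74, 72, 76, 74],
})


def league_comparison(df, team, column="overall"):
    """Compare a team's mean per position group with its league's mean."""
    league = df.loc[df["club"] == team, "league"].iloc[0]
    in_league = df[df["league"] == league]

    league_mean = in_league.groupby("position_group")[column].mean()
    team_rows = in_league[in_league["club"] == team]
    team_mean = team_rows.groupby("position_group")[column].mean()

    result = pd.DataFrame({"team": team_mean, "league": league_mean})
    result["difference"] = result["team"] - result["league"]
    # Most negative difference first, so the weakest area is at the top.
    return result.sort_values("difference")


comparison = league_comparison(players, "Example FC")
print(comparison)
```

Passing a different `column`, such as a player value column, gives the same comparison for the more general measures mentioned above.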

teams analysis

Transfer recommendations

The transfer recommendations were based on the following premises:
• The players improve upon or match the level of the current options at the club
• The club obviously cannot buy players who cost more than its budget
• The players help to rebuild (young players) or add experience (older players) to the club

Also, a player at a bigger club most likely wouldn't move to a club that is considered lower than their current one, e.g. Harry Kane wouldn't move to Stoke. But how can we quantify this? FIFA 20 has something called prestige, which is a 'measure' of a team's reputation among teams in its own country (Domestic) and internationally (International). Using this along with the budget, we can recommend players who not only fit the budget but are also likely to join the club relatively easily. This method can rule out certain players, but it ensures that you have a higher chance of signing the recommended players and adds a bit of realism to the scouting process. It is therefore more suited to players who want a realistic career mode experience.
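
The premises above can be sketched as a simple filter. This is an illustrative Python 3 version, not the app's actual query: the `Player` fields, the `recommend` function, the single prestige number per club and the sample candidates are all assumptions. The age premise is left out because the thresholds for "young" and "experienced" are not stated.

```python
from dataclasses import dataclass


@dataclass
class Player:
    name: str
    position_group: str
    overall: int
    age: int
    value_eur: int
    club_prestige: int  # prestige of the player's current club (assumed field)


def recommend(candidates, position_group, current_level, budget_eur,
              team_prestige, limit=5):
    """Return affordable players who match the level and are likely to join."""
    shortlist = [
        p for p in candidates
        if p.position_group == position_group
        and p.overall >= current_level          # improves on or matches current options
        and p.value_eur <= budget_eur           # within the club's budget
        and p.club_prestige <= team_prestige    # not at a more prestigious club
    ]
    # Best rating first; cheaper players win ties.
    return sorted(shortlist, key=lambda p: (-p.overall, p.value_eur))[:limit]


candidates = [
    Player("Player A", "attacker", 75, 21, 800_000, 3),
    Player("Player B", "attacker", 90, 26, 50_000_000, 9),
    Player("Player C", "attacker", 72, 31, 900_000, 5),
    Player("Player D", "attacker", 65, 19, 200_000, 2),
    Player("Player E", "attacker", 71, 32, 500_000, 4),
]

targets = recommend(candidates, "attacker", current_level=70,
                    budget_eur=1_000_000, team_prestige=4)
print([p.name for p in targets])
```

With a budget of 1 million, Player B is excluded on price and prestige, Player C on prestige and Player D on level, which is the kind of realism the prestige check is meant to add.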

transfer

Player Analysis

For the player analysis, the selected player is compared against the top players in their position group (goalkeepers, defenders, midfielders or attackers), looking at attributes relevant to that position.

players
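
A minimal pandas sketch of this comparison follows. It measures the selected player against the average of the top-rated players in the same position group; the attribute lists per position, the choice of the top three, the column names and the sample rows are all assumptions for illustration.

```python
import pandas as pd

# Made-up players; attribute columns are assumptions.
players = pd.DataFrame({
    "short_name": ["Player A", "Player B", "Player C", "Player D", "Player E"],
    "position_group": ["attacker", "attacker", "attacker", "attacker", "defender"],
    "overall": [90, 88, 85, 70, 80],
    "pace": [90, 86, 82, 76, 60],
    "shooting": [92, 88, 84, 68, 40],
    "defending": [35, 40, 38, 30, 82],
})

# Which attributes matter for each position group (assumed mapping).
ATTRIBUTES = {
    "attacker": ["pace", "shooting"],
    "defender": ["pace", "defending"],
}


def compare_with_top(df, name, top_n=3):
    """Compare one player's key attributes with the top players' average."""
    player = df[df["short_name"] == name].iloc[0]
    group = player["position_group"]
    attrs = ATTRIBUTES[group]

    top = df[df["position_group"] == group].nlargest(top_n, "overall")
    result = pd.DataFrame({
        "player": player[attrs].astype(float),
        "top_avg": top[attrs].mean(),
    })
    result["gap"] = result["player"] - result["top_avg"]
    return result


comparison = compare_with_top(players, "Player D")
print(comparison)
```

A negative gap shows how far the selected player is behind the benchmark for that attribute, which maps naturally onto a bar or radar chart.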

Flask and SQLAlchemy

Flask was used to create the site: it takes the data returned by the SQLAlchemy queries, combines it with the data analysis and visualisation, and presents the results on a page. SQLAlchemy was used to communicate with the database and extract information via queries. The app was then deployed on PythonAnywhere.
flask code
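
For readers who want to see how Flask and SQLAlchemy fit together, here is a small self-contained sketch. It is not the app's actual code: the route, the `players` table and its rows are hypothetical, the response is JSON rather than a rendered page to keep the example short, and an in-memory SQLite database stands in for MySQL.

```python
from flask import Flask, jsonify
from sqlalchemy import create_engine, text

# In-memory SQLite keeps the sketch runnable; a MySQL URL such as
# "mysql+pymysql://user:password@host/fifa20" would be used instead.
engine = create_engine("sqlite://")

# Create and fill a tiny stand-in table with made-up rows.
with engine.begin() as conn:
    conn.execute(text(
        "CREATE TABLE players (short_name TEXT, club TEXT, overall INTEGER)"
    ))
    conn.execute(
        text("INSERT INTO players VALUES (:short_name, :club, :overall)"),
        [
            {"short_name": "Player A", "club": "Example FC", "overall": 78},
            {"short_name": "Player B", "club": "Example FC", "overall": 71},
        ],
    )

app = Flask(__name__)


@app.route("/team/<team_name>")
def team_summary(team_name):
    """Return the squad size and average rating for one team."""
    # The bound parameter :club avoids building SQL from user input.
    query = text("SELECT COUNT(*), AVG(overall) FROM players WHERE club = :club")
    with engine.connect() as conn:
        count, average = conn.execute(query, {"club": team_name}).fetchone()
    if count == 0:
        return jsonify(error="team not found"), 404
    return jsonify(team=team_name, players=count, average_overall=average)
```

The route can be tried without starting a server through `app.test_client().get("/team/Example%20FC")`. In a page-based app the same query result would be passed to `render_template` together with the charts.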

The app

Here are some images of the app and some search results

The app provides useful and detailed analysis, but there is room for more additions. In the future I will try to add comparisons between teams in different leagues. I may also use a machine learning method to predict the likelihood that a player would move to the selected club and use that as an additional filter, e.g. only include players whose predicted value is over 0.5, meaning they are likely to join the club.