In this exercise, we will go through an algorithm that computes the convex hull of a set of X-Y points.
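The description above does not say which hull algorithm the notebook uses. As a rough sketch only, here is Andrew's monotone chain, one common choice, in plain Python. The function names and the sample points are illustrative assumptions, not taken from the notebook.

```python
def cross(o, a, b):
    """Z-component of the cross product OA x OB (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:  # build the lower chain from left to right
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):  # build the upper chain from right to left
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


# Hypothetical sample: a square with one interior point and one edge point.
square = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
hull = convex_hull(square)
```

On this sample, only the four corners of the square remain in `hull`.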
Pathfinding - Dijkstra Algorithm
In this exercise, we will explore the well-known pathfinding algorithm called Dijkstra's algorithm.
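For readers who want a quick reminder of the idea, a minimal sketch of Dijkstra's algorithm with a binary heap could look like this. The graph representation (a dict of dicts with non-negative weights) and the toy graph are assumptions made for the example, not the notebook's data.

```python
import heapq


def dijkstra(graph, source):
    """Shortest distances from source; graph maps node -> {neighbor: weight >= 0}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for neighbor, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


# Hypothetical toy graph.
toy_graph = {
    "A": {"B": 1, "C": 4},
    "B": {"C": 2, "D": 5},
    "C": {"D": 1},
    "D": {},
}
distances = dijkstra(toy_graph, "A")
```

Here the cheapest route to `D` goes through `B` and `C` for a total cost of 4, cheaper than the direct `B` to `D` edge.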
The exercise in this notebook is to create and train the elementary building block of a Neural Network, called the Perceptron.
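To give an idea of what training a Perceptron involves, here is a small sketch of the classic perceptron learning rule on the logical AND function. The dataset, the learning rate and the function names are assumptions chosen for the illustration and are not taken from the notebook.

```python
def predict(weights, bias, x):
    """Step activation: 1 if the weighted sum is strictly positive, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Perceptron rule: move the weights by lr * error * input after each sample."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            error = target - predict(weights, bias, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


# Logical AND is linearly separable, so the rule converges on it.
inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_labels = [0, 0, 0, 1]
weights, bias = train_perceptron(inputs, and_labels)
predictions = [predict(weights, bias, x) for x in inputs]
```

After training, `predictions` matches `and_labels`. A function such as XOR would not work, because a single Perceptron can only separate classes with a straight line.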
Travelling Salesman Problem
In this workbook, the objective is to create a Genetic Algorithm that finds a good solution to the well-known NP-hard Travelling Salesman Problem.
Tridiagonal Random Matrix
This workbook was created to help someone on the OpenClassrooms forum. The idea is to observe how the eigenvectors of a tridiagonal matrix change when noise is added.
ML - Regression - Auto MPG Dataset
The objective of this workbook is to model a car's fuel consumption from a set of features. The data is available here.
ML - Classification - Iris Dataset
In this workbook, we are going to explore the well-known Iris dataset and fit a Forest Tree classifier to visualize how it works.
ML - Tensorflow - Logistic Classification - Moons Dataset
In this exercise, the objective is to create a (linear and polynomial) Logistic Classifier on the Moons dataset using TensorFlow.
ML - Tensorflow - Exploration of Neural Network
In this workbook, the objective is to see how some Neural Network parameters and features affect the training time and speed.
ML - Scikit Learn - House rent prices
This workbook sets up a model that predicts the rent from the surface area and the district. It is based on an exercise from OpenClassrooms.
ML - Scikit Learn - Kaggle - Credit Card Fraud
In this workbook, we will explore a highly imbalanced dataset about credit card fraud found on Kaggle. On this dataset, we will see the advantage of performing PCA on an undersampled dataset to simplify the model.
ML - Scikit Learn / Keras - Kaggle - Human Resources Analytics
In this workbook, we will explore a dataset, found on Kaggle, about the reasons why employees leave a company.
ML - Keras - Introduction to RNNs with Reber's grammar
This workbook introduces Recurrent Neural Networks, which are widely used for temporal data. In this introduction, we will test some models based on Reber's grammar.
ML - FBProphet - Predict delay of flights
During the Master of Data Science, we had to predict flight delays based on date, hour, airline, airports, ... I tried multiple models for that project, but in this notebook we will discover the library called FBProphet, used to find patterns in time series.
ML - Scikit Learn - Test of UMAP on clustering of articles
UMAP is a new dimensionality reduction algorithm published on arXiv. To try it, I attempted to improve the clustering of articles from Project 5 of the Master of Data Science. All articles are converted to a TF-IDF matrix. Latent Semantic Analysis is then performed on this matrix to obtain a dense result, and K-means is applied. For the visualisation, t-SNE is compared with UMAP.
ML - Keras - Simulation of double pendulum using RNNs
After creating a simulator of the double pendulum, a dataset is generated in order to train an RNN on it. By using a few LSTMs, a correct simulation of this chaotic system was reached.
ML - Keras - Simulation of double pendulum using RNNs (part 2)
Previously, we tried to simulate a double pendulum using an RNN. To do so, we used fixed model parameters (length and mass). In this workbook, we will try to extend this model so that it also handles variable parameters and a wider range of initial parameters.
ML - Keras - Classification Dataset QuickDraw with 20 similar classes
In this notebook, different CNNs are tested on an image dataset that is more complex than MNIST but still hand-drawn (20 classes, animals only, 28x28 in B&W). Classic CNNs are tested as well as CapsNet, before looking in more detail at the misclassifications. We will see that these are often errors related to the person who made the drawing...
ML - Numpy - Test of Neuro-Evolution instead of Reinforcement Learning
In this notebook, I tried to train a Neural Network to solve the CartPole problem from the gym library using a Genetic Algorithm instead of Reinforcement Learning with backpropagation. After analysing its behavior, a linear model was trained that performs nearly as well as the NN.
DA - R - Exploration of Codingame Leaderboard
While learning R, I used a web-scraped version of the Codingame leaderboard in order to apply several things learned in class (working on a dataset, plotting with ggplot and reporting with R Markdown). In this Notebook, we will explore the leaderboard and the balance of players. In addition, we will try to find tricks to climb quickly in the ranking.
ML - Tensorflow - DeepDream video
Based on a playlist from Sentdex, his code was reused and slightly modified in order to generate a DeepDream video using the "inception" model. There is no Notebook for this project, but a fairly complete README is available on GitHub.
ML - Keras - Audio Classification
As can be done with MNIST, this Notebook explores audio files in an unsupervised way and then tries to classify them (with simple ANNs, CNNs and RNNs).
ML - Keras - Audio Classification FMA
This week, we will explore another, more complete audio dataset: the FMA dataset. As I did previously, I'll try to create a music genre classifier. We will discover that the success of the previous model was only due to overfitting, and that this approach doesn't work as well as expected, in addition to having other drawbacks.
ML - Keras - Audio Segmentation FMA
After testing the audio classification based on pattern recognition on the FFT signal, a new approach was tested using segmentation and clustering. This method treats the data similarly to text data and has the advantage of being compatible with audio samples of any length.
ML - Keras - Audio Classification FMA using meta data
After testing the audio classification based on pattern recognition on the FFT signal and the audio clustering, I wanted to try classification using metadata extracted from the audio. Part 1 is an attempt on the small balanced dataset with only 1 genre per song, and part 2 is an attempt on the full unbalanced multilabel dataset, but without good results.
ML - Kaggle Comp. - Free Sound Audio Tagging
To finish with audio processing, I decided to work on another dataset, from a Kaggle competition, which is also about classifying audio. Compared to the FMA dataset, this one is slightly unbalanced in terms of classes but also very different in terms of duration. To do the classification, we will extract the same features as for FMA and add the duration before classifying. The result is not perfect but still reaches 74% accuracy.
ML - Numpy + Scipy - Regression on circular data
At work, I faced the question of how to find the best circle fitting a set of points measured in (X, Y) coordinates. This gave me the idea to take a deeper look at this question: find algorithms (mainly using Scipy), test them and implement them with a syntax close to the sklearn library.
ML - NoSQL + sklearn + nltk - Sentiment Analysis on Tweets
A new project I'm starting uses sentiment analysis on tweets. For now, I've uploaded one Notebook covering the querying of tweets and their storage in a NoSQL database, and a second one where I tried to build a simple model to predict the sentiment of a tweet.
Data Analysis - Sentiment Analysis on Tweets
In the previous script, we set up an API to query tweets and perform the sentiment analysis. This script ran during the France vs Argentina match, and this notebook is a data analysis of the sentiment during the match.
AI - Solver of Taquin
A slightly different subject today. We will try to create a simple solver for the Taquin (sliding puzzle) using graph algorithms. BFS, DFS and A* will be implemented in order to find a solution to the 3x3 Taquin (not 4x4 yet, as it is far more complex).
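As a rough illustration of the graph-search view of the puzzle, here is a sketch of a breadth-first search on the 3x3 board. The state encoding (a tuple of 9 numbers read row by row, with 0 for the empty cell), the goal layout and the start position are assumptions made for this example and may differ from the notebook.

```python
from collections import deque

GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)  # 0 marks the empty cell (assumed layout)


def neighbors(state):
    """Yield every state reachable by sliding one tile into the empty cell."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < 3 and 0 <= c < 3:
            target = r * 3 + c
            board = list(state)
            board[blank], board[target] = board[target], board[blank]
            yield tuple(board)


def solve_bfs(start, goal=GOAL):
    """Return the shortest list of states from start to goal, or None."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while state is not None:  # walk back through the parents
                path.append(state)
                state = parents[state]
            return path[::-1]
        for nxt in neighbors(state):
            if nxt not in parents:
                parents[nxt] = state
                queue.append(nxt)
    return None  # the goal is unreachable from this start


start = (1, 2, 3, 4, 0, 6, 7, 5, 8)  # two moves away from the goal
path = solve_bfs(start)
```

BFS explores states level by level, so the first time it reaches the goal the path is a shortest one. The price is memory, which is one reason the 4x4 version is so much harder.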
DA - Python - Exploration of Codingame Competition's Leaderboard
After a discussion about programming languages, I wanted to take a look at the most used programming languages for AI in general. After extracting data from the competition's leaderboard, we will explore the results and try to understand the choices.
AI - Java - Genetic algorithm and PID
This week I made a project in Java (using Processing): a small genetic algorithm able to "drive" a solid between doors, as in Flappy Bird. To make the learning fast, I provide a trajectory to follow, and the genome consists of the PID parameters and the speed along the x-axis.
ML - Keras - Heartbeat Classification
In this Notebook, the goal was to reproduce the model published in the paper arXiv 1805.00794. We will also see that even though the result is very good, there is very strong overfitting, which makes the model impossible to reuse on other ECGs.
GA - Java - Code tester
This small project compares 4 algorithms that find a code based only on the number of correct digits. The first one tries digits one by one at each position to find the correct one. The second one is a genetic algorithm whose fitness is the number of correct digits. The third one is a greedy algorithm and the last one is a random generator.
Algorithm - Java - Fractal
To continue with "Processing" in Java, I wanted to learn a bit more about using the mouse. To do so, I created a fractal renderer. With a left-click, you can change the complex number used in the series. With a right-click, you can pan the image, and with the mouse wheel, you can zoom in/out.
Android + Java - Bluetooth communication
For this second project, I wanted to learn a bit of Android. The idea was to create a simple app that reads the values of the phone's sensors and sends them to a Java program running on my computer. At the beginning, I wanted to create a small game controlled by the sensors, but there is too much noise. At least I managed to understand how an Android app works and how to use Bluetooth (which was my main goal).
ML - FBProphet - Exploration of Air Quality in Madrid
This week, I decided to use the FBProphet library once more, on another dataset with more data than the flight delays, which only covered 1 year. This dataset is lighter but has 18 years of records about air quality in Madrid (taken every hour at 24 points). It is more suitable for time series models like this one. In this notebook, we will discover the impact of human activity on pollution.
AI - Tensorflow / Gym - Presentation of the Q-Learning
In this Notebook, we will explore the simplest algorithm for learning a policy, called Q-Learning. To do so, we will start with the simplest example, the search for the shortest path in a graph. Then, we will use gym to try it on a few puzzles with different challenges (sparse reward, large state space, continuous input/output). This example will lead on to other algorithms in the future.
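To make the shortest-path example more concrete, here is a sketch of tabular Q-Learning on a tiny graph. The graph, the reward of -1 per move, the hyperparameters and the purely random exploration are all assumptions chosen for the illustration, not the notebook's setup.

```python
import random

# Hypothetical directed graph; "moving to a neighbor" is the action.
GRAPH = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": ["E"], "E": []}
GOAL = "E"


def q_learning(graph, goal, episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn Q(state, next_node) with a reward of -1 per move."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s, actions in graph.items() for a in actions}
    starts = [s for s in graph if s != goal]
    for _ in range(episodes):
        state = rng.choice(starts)
        while state != goal:
            action = rng.choice(graph[state])  # random exploration
            reward = -1.0
            # Best value available from the next state (0 at the goal).
            future = max((q[(action, a)] for a in graph[action]), default=0.0)
            target = reward + gamma * future
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = action
    return q


def greedy_path(q, graph, start, goal):
    """Follow the best learned action from start (assumes an acyclic graph)."""
    path = [start]
    while path[-1] != goal:
        state = path[-1]
        path.append(max(graph[state], key=lambda a: q[(state, a)]))
    return path


q_table = q_learning(GRAPH, GOAL)
best = greedy_path(q_table, GRAPH, "A", GOAL)
```

Because every move costs -1, the highest Q-value from a node points toward the route with the fewest moves, so the greedy path from `A` goes through `C` instead of the longer detour through `B` and `D`.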
ML - Keras/Sklearn/Gensim - Recipes classification
This Notebook is an extension of the work done for Kaggle's competition "What's Cooking". The objective is to classify the origin of a recipe based on its ingredients. In this one, several pre-processing approaches and a few models are tried, with decent results.
AI - Keras/Gym - Introduction to Deep-Q Network
In a previous script, we saw the limits of Q-Learning. Another approach, proposed by DeepMind in 2013, was to hand the task of evaluating the Q-value to a Neural Network. The Deep-Q Network was born. In this notebook, we will see the principle and create our first DQN.
Libs - Python - Presentation of 2 libs
Being stuck on a future project, I'm posting fewer and fewer projects. This week I wanted to share 2 libraries I found and had wanted to try for a while: TPOT, which uses Genetic Algorithms to optimise the pre-processing steps, and Black, which is a code formatter for Python files.
Simulation - PyCUDA/Scipy - Comparison GPU vs CPU for Conway's Game of Life
Following a discussion I had on a forum, I wanted to look at GPU performance for non-matrix computation. I decided to run the trial on Conway's Game of Life. In this notebook, we will compare the performance of 3 different approaches.
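As a reminder of the rules being benchmarked, here is a plain-Python reference step for the Game of Life. It is only a sketch for readability, using a set of live cells on an unbounded grid, and it is not one of the three approaches compared in the notebook.

```python
from collections import Counter


def step(live):
    """Return the next generation from a set of live (x, y) cells."""
    # Count, for every cell, how many live neighbors it has.
    counts = Counter(
        (x + dx, y + dy)
        for x, y in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth with exactly 3 neighbors, survival with 2 or 3.
    return {cell for cell, n in counts.items() if n == 3 or (n == 2 and cell in live)}


blinker = {(0, 1), (1, 1), (2, 1)}  # horizontal bar of three cells
next_generation = step(blinker)
```

The blinker flips from a horizontal bar to a vertical one and comes back after a second step. Each cell only depends on its 8 neighbors, which is what makes the problem a natural fit for a one-thread-per-cell GPU kernel.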
AI - Keras/Gym - Policy Gradient on Cartpole
The Deep-Q Network predicts the expected future reward of taking an action in a given state. Another approach is to predict the action directly. This is what the Policy Gradient model proposes, and we will discover it in this notebook.
AI - Keras/Gym - Proximal Policy Optimization on Cartpole
For a few years now, Policy Gradient has been more or less obsolete. Another algorithm, a bit more complex (but better performing), was published by OpenAI in 2017: Proximal Policy Optimization. Due to its simplicity, it has also replaced the Deep-Q Network in several cases. In this notebook, we will set up this model on Cartpole again.
Simulation of Momentum Trading - Python - Data Preparation
In finance, Momentum Trading is a portfolio optimization technique. This method has been studied for a long time and shows performance above the classic investments offered by banks. In a series of Notebooks, the goal will be to build a simulator on the past performance of ETFs and check whether this is still true in 2018.
Simulation of Momentum Trading - Python - Simple Simulation
Previously, we pre-processed the history of ETFs. This notebook builds a first simple model that takes performance into consideration to estimate the possible benefit of this method. The idea is to have a proof of concept.
Simulation of Momentum Trading - Python - Benchmark
In the previous notebook, a simple model was made to check that this method more or less works. The next step is to run simulations with multiple parameters to see how "safe" this method is.
Simulation of Momentum Trading - Python - LSTM
To try to improve/stabilize the decision about ETFs, we can try to train a Recurrent Neural Network on the last N weeks of data to predict the future performance. This is what I tried to do in this last notebook.
Web Scraping - Python - Birthday Bot
Following a video by Micode, I wanted to build a Twitter bot. For this project, celebrities' birthdays were extracted from a website, and a bot is set up to wish celebrities a happy birthday on Twitter on a daily basis.
ML - Python - Pose Estimation
Compared to previous scripts, this one doesn't have any Notebook, as I only created a pipeline to process a video through a Pose Estimation model called DensePose. This project was done on the video clip called "Skibidi", as it presents challenges for the model (varying number of people, costumes, fast moves, ...).
ML - Python - Bayesian Networks
In a previous project, I discovered Bayesian Networks. To understand them a bit more, I decided to do a few exercises on the topic. In this Notebook, I dig a bit into the "pgmpy" library with an example and an exercise. After that, I took the opportunity to build a word generator using a Markov model.
Simulation - Python - Multiple knapsack problem
Let's assume we want to assemble a stack of parts. How can we minimize the variation of a specific criterion across all the produced parts? This is what I wanted to look at in this Notebook using Genetic Algorithms.
Reinforcement Learning - Python/Tensorflow - Minesweeper Bot
After learning DQN and a bit of PPO, I wanted to apply them to a medium-scale project. Minesweeper looked like a good candidate. However, I realized that it is not, because the environment is partially observable and stochastic. As a result, I just trained 2 models:
- The first one predicts the probability of a bomb given the neighborhood (5x5 grid).
- The second one works on the full grid.
Unfortunately, the result on the full grid is not very good, and the other one is not very convenient...
Optimization - Python - Filters decomposition
In image processing, a common step is the application of convolutional filters. Depending on the size of the filter, processing an image can take a long time. In this Notebook, we will see how this can be improved a lot by using filter decomposition.
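One common form of filter decomposition is the separable filter, where a 2D kernel that is the outer product of a column vector and a row vector is replaced by two 1D passes. The sketch below assumes this is the kind of decomposition meant here. The Sobel-like kernel, the tiny image and the function names are illustrative choices, and the filter is applied as a correlation (no kernel flip) over the "valid" region only.

```python
def correlate2d(image, kernel):
    """Slide the kernel over the image without flipping it ('valid' region only)."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
            for j in range(w - kw + 1)
        ]
        for i in range(h - kh + 1)
    ]


def separable_filter(image, col, row):
    """Apply the row vector first, then the column vector."""
    horizontal = correlate2d(image, [row])
    return correlate2d(horizontal, [[c] for c in col])


# Sobel-like kernel written as an outer product: kernel[a][b] = col[a] * row[b].
col = [1, 2, 1]
row = [-1, 0, 1]
kernel = [[c * r for r in row] for c in col]

image = [
    [3, 1, 4, 1],
    [5, 9, 2, 6],
    [5, 3, 5, 8],
    [9, 7, 9, 3],
]
full = correlate2d(image, kernel)
fast = separable_filter(image, col, row)
```

Both results are identical, but a k x k kernel costs k*k multiplications per output pixel while the two 1D passes cost about 2k, which is where the speed-up comes from on large filters.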
Optimization - Python - The Centrifuge Problem
Over the last few weeks, I did some Codingame puzzles, and I found a nice problem on YouTube that I decided to solve. The solution is only a .py file, as there is not a lot to explain. The solution is presented in the attached video.
Optimization - Python - Sudoku Solver
I found a dataset of 1 million Sudokus on Kaggle. I decided to download it to check whether I could solve them, but also to check what makes them difficult to solve (for an algorithm). 3 algorithms are compared. To finish, the 4 most complex Sudokus are tested on these algorithms.
NetworkX - Python - Facebook Network
Previously, I discovered this video by Michael Launay explaining a funny fact: in most cases, you will have the impression of having fewer friends on Facebook than your friends do. So I decided to use what I had learned about NetworkX to check this out.
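This effect is usually called the friendship paradox, and it can be checked on any graph by comparing the average number of friends with the average number of friends that one's friends have. The sketch below uses a plain dict instead of NetworkX and a made-up star-shaped network, so it only illustrates the computation, not the notebook's analysis.

```python
def friendship_paradox(adjacency):
    """Return (mean friend count, mean over people of their friends' mean count)."""
    degree = {person: len(friends) for person, friends in adjacency.items()}
    mean_degree = sum(degree.values()) / len(degree)
    friend_means = [
        sum(degree[f] for f in friends) / len(friends)
        for friends in adjacency.values()
        if friends  # people without friends have nothing to compare with
    ]
    return mean_degree, sum(friend_means) / len(friend_means)


# Hypothetical network: one popular person connected to three others.
star = {"hub": ["a", "b", "c"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
mean_degree, mean_friend_degree = friendship_paradox(star)
```

On this toy network, people have 1.5 friends on average, while their friends have 2.5 on average: popular people appear in many friend lists, which pulls the second number up.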
Cryptography - Python - Quantum Coin Toss
On YouTube, Physics Girl presented a way to play a coin toss at a distance, using light and quantum physics. This gave me the idea to start learning a bit of cryptography, and this is the first notebook on the subject.
Cryptography - Python - RSA
In the field of cryptography, RSA is one of the most common asymmetric-key encryption schemes. In this Notebook, we will dig into this algorithm (how it works and a simple example).
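To show the arithmetic behind RSA, here is a toy round trip with tiny primes. The numbers are assumed example values chosen for readability, and such a small key offers no security at all. The three-argument `pow` with a negative exponent requires Python 3.8 or later.

```python
# Toy RSA with tiny primes: for illustration only, never for real use.
p, q = 61, 53                # secret primes (assumed example values)
n = p * q                    # public modulus
phi = (p - 1) * (q - 1)      # Euler's totient of n
e = 17                       # public exponent, must be coprime with phi
d = pow(e, -1, phi)          # private exponent: modular inverse of e mod phi

message = 65                 # plaintext encoded as an integer smaller than n
cipher = pow(message, e, n)      # encryption: message^e mod n
recovered = pow(cipher, d, n)    # decryption: cipher^d mod n
```

The public key is `(n, e)` and the private key is `d`. Decryption works because `e * d` equals 1 modulo `phi`, so raising to the power `e` and then `d` brings the message back.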
Cryptography - Python - Elliptic Curve Cryptography
While starting to learn cryptography, I discovered the principle of Elliptic Curve Cryptography. I was amazed that the fast exponentiation principle can be applied to geometry. In this Notebook, we will go through the principle of the Diffie-Hellman key exchange as well as the implementation of ECC to generate keys.
Cryptography - Python - Block Cipher encryption
Previously, we discovered how to create symmetric and asymmetric keys, which were then used with very basic encryption methods. In this notebook, we will explore how to encrypt any kind of data. Multiple solutions exist, but we will only test the 2 safest modes (CBC and CTR) with the 2 safest symmetric encryption algorithms (AES and Triple DES).
Cryptography - Python - Discrete Log Attack
In the first Notebooks about cryptography, we talked about key creation and said that keys need to be very long. In this Notebook, we will see how to attack a public key to find the private key and be able to decrypt messages.
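The description does not say which attack the notebook implements. As one possible illustration of why short keys are unsafe, here is a sketch of the baby-step giant-step method for the discrete logarithm, which recovers a secret exponent in roughly sqrt(p) operations. The prime, the base and the secret are made-up small values, and `isqrt` as well as `pow` with a negative exponent need Python 3.8 or later.

```python
from math import isqrt


def discrete_log(g, h, p):
    """Find x with pow(g, x, p) == h using baby-step giant-step, or None."""
    m = isqrt(p - 1) + 1
    # Baby steps: remember g^j mod p for every j in [0, m).
    baby = {}
    value = 1
    for j in range(m):
        baby.setdefault(value, j)
        value = value * g % p
    # Giant steps: multiply h by g^(-m) until a baby step is hit.
    factor = pow(g, -m, p)
    gamma = h % p
    for i in range(m):
        if gamma in baby:
            return i * m + baby[gamma]
        gamma = gamma * factor % p
    return None


# Hypothetical small parameters: prime modulus, base and private key.
p, g = 1019, 2
private_key = 347
public_key = pow(g, private_key, p)
found = discrete_log(g, public_key, p)
```

The exponent returned in `found` reproduces the public key, which is all an attacker needs. The work grows with the square root of the modulus, so each extra bit of key length makes this kind of attack noticeably more expensive.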
Image Preprocessing - Python - Seam Carving
While solving a Codingame puzzle, I discovered the Seam Carving algorithm, which could be a very good solution for pre-processing images in a machine learning project. It would help to resize images without stretching them, as usually happens with standard resizing.
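The core of Seam Carving is a dynamic program that finds the connected top-to-bottom path of pixels with the lowest total energy, then removes it. Below is a minimal sketch of that step on a tiny hand-written energy map. In practice the energy usually comes from image gradients, which is left out here, and the function names are illustrative.

```python
def find_vertical_seam(energy):
    """Return, for each row, the column of the minimum-energy vertical seam."""
    height, width = len(energy), len(energy[0])
    # cost[r][c] = cheapest total energy of a seam ending at (r, c).
    cost = [list(energy[0])]
    for r in range(1, height):
        prev = cost[-1]
        cost.append([
            energy[r][c] + min(prev[max(c - 1, 0):min(c + 2, width)])
            for c in range(width)
        ])
    # Backtrack from the cheapest cell of the last row.
    col = min(range(width), key=lambda c: cost[-1][c])
    seam = [col]
    for r in range(height - 1, 0, -1):
        lo, hi = max(col - 1, 0), min(col + 2, width)
        col = min(range(lo, hi), key=lambda c: cost[r - 1][c])
        seam.append(col)
    return seam[::-1]


def remove_seam(image, seam):
    """Drop one pixel per row, making the image one column narrower."""
    return [row[:c] + row[c + 1:] for row, c in zip(image, seam)]


# Hypothetical energy map (higher means more important content).
energy = [
    [1, 4, 3],
    [5, 2, 6],
    [7, 1, 8],
]
seam = find_vertical_seam(energy)
narrower = remove_seam(energy, seam)
```

The seam found here goes through the cells with energy 1, 2 and 1, so the image shrinks by one column while the high-energy pixels are kept.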
Image Classification - Keras - Class Activation Map
During a previous Kaggle competition, I discovered the principle of the Class Activation Map (CAM), which visualizes which areas the model uses to make its prediction. After the discovery about the impact of texture on classification, I wanted to use this method on this image (and others) in one Notebook.
Image Classification - Keras - Seam Carving Preprocessing
Previously, we saw a technique to resize an image without losing areas of interest, and a way to visualize which part of an image is used for the classification. In this Notebook, we will try using Seam Carving on large images instead of classic resizing and see the impact on the classification.
Image Classification - Keras - Seam Carving on Stanford dogs dataset
In the previous Notebook, we saw the benefit of Seam Carving for classifying an image with a high width/height ratio. Because of this, I wanted to test the benefit of this method on a complete dataset. I took the Stanford dog breeds dataset for this trial.
Simulation - p5.js - Ray tracing using Ray Marching
Some time ago, I wanted to learn a bit about ray tracing. I discovered the technique called Ray Marching, which finds the intersection of a ray with a surface. As the principle is interesting, I wanted to make my own implementation with p5.js. I also proposed this as a problem on Codingame.
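The project itself is written with p5.js. Purely to illustrate the principle, here is a Python sketch of the sphere-tracing variant of Ray Marching: the ray advances by the distance to the nearest surface, given by a signed distance function, until it gets close enough to count as a hit. The scene (a single sphere), the thresholds and the function names are assumptions made for this example.

```python
import math


def sphere_sdf(point, center=(0.0, 0.0, 5.0), radius=1.0):
    """Signed distance from a point to a sphere (negative inside)."""
    return math.dist(point, center) - radius


def ray_march(origin, direction, sdf, max_steps=100, epsilon=1e-4, max_distance=100.0):
    """Return the distance travelled to the surface, or None if nothing is hit.

    direction is expected to be a unit vector.
    """
    travelled = 0.0
    for _ in range(max_steps):
        point = tuple(o + travelled * d for o, d in zip(origin, direction))
        distance = sdf(point)
        if distance < epsilon:
            return travelled  # close enough to the surface: this is a hit
        travelled += distance  # safe step: nothing is closer than this distance
        if travelled > max_distance:
            break
    return None


hit = ray_march((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), sphere_sdf)
miss = ray_march((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), sphere_sdf)
```

The first ray points at the sphere and stops on its surface, 4 units away, while the second one passes beside it and returns `None`. Stepping by the signed distance guarantees the ray never jumps through a surface.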
Visualisation - leaflet.js - Visualization of car accidents in France
On the government website, we have access to all the records reported by the police about car accidents. After some simple processing in Python, I took the opportunity to learn a bit about leaflet.js by building a visualization of accidents by year and location.