Research Directions
Improve Clustering
Compare algorithms
Shlomi: Some time ago I compared clustering of latitude/longitude coordinates using k-means and a Gaussian mixture model (GMM). See the results here.
Evaluate
Find a quantitative measure of cluster quality? For example, the distance between the two most distant points in the cluster (the cluster diameter).
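A minimal sketch of that measure, assuming points are (latitude, longitude) pairs in degrees and that great-circle (haversine) distance is an acceptable metric. The function names and the 6371 km Earth radius are illustrative choices, not something taken from the project code.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def cluster_diameter_km(points):
    """Largest pairwise distance in one cluster; 0.0 for fewer than two points."""
    return max((haversine_km(p, q) for p, q in combinations(points, 2)),
               default=0.0)


def diameters_by_label(points, labels):
    """Group points by cluster label and return the diameter of each cluster."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    return {label: cluster_diameter_km(pts) for label, pts in clusters.items()}
```

Smaller diameters mean tighter clusters. The pairwise loop is quadratic in cluster size, so very large clusters may need sampling.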
EDA - new insights
Autocorrelation plot
- (bread and butter in time series analysis) - does it make sense here?
- What about adding data artificially to turn this into a real time series problem?
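To judge whether an autocorrelation plot makes sense, one option is to compute the values directly on a regularly sampled series. The sketch below assumes the events have already been aggregated into equally spaced bins (for example hourly counts); that aggregation step and the function name are assumptions, not part of the existing pipeline.

```python
def autocorrelation(series, max_lag):
    """Sample autocorrelation of an equally spaced series for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    variance = sum(c * c for c in centered)
    if variance == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    return [
        sum(centered[t] * centered[t + lag] for t in range(n - lag)) / variance
        for lag in range(max_lag + 1)
    ]
```

A clear peak at some lag (for example 24 on hourly counts) would suggest a periodic pattern worth modelling. A flat profile would suggest the data does not behave like a time series in this form.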
Machine Learning
New models
Looking ahead: XGBoost is OK, but there are many other models to try: Markov models, generalized additive models, and features generated from deep learning embeddings.
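As one illustration of the Markov model idea, here is a minimal first-order sketch that estimates transition probabilities between place labels. It assumes each user's history is available as an ordered list of labels such as "home" or "office"; the labels and function names are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequence):
    """Estimate P(next place | current place) from one ordered label sequence."""
    counts = defaultdict(Counter)
    for current, following in zip(sequence, sequence[1:]):
        counts[current][following] += 1
    return {
        state: {nxt: c / sum(followers.values()) for nxt, c in followers.items()}
        for state, followers in counts.items()
    }


def most_likely_next(probabilities, state):
    """Return the most probable next place, or None for an unseen state."""
    followers = probabilities.get(state)
    if not followers:
        return None
    return max(followers, key=followers.get)
```

The estimated probabilities could also be fed to XGBoost as extra features instead of being used as a standalone predictor.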
Maps
Google Maps: Victor's tool, gmaps
geoplotlib: a Python Toolbox for Visualizing Geographical Data
Feature Engineering
Features
More place contexts (shop, gym, restaurant, second sleeping place)
Distance between events, distance from home/office/gym/supermarket
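A sketch of how these distance features might be computed, assuming events arrive in time order as (latitude, longitude) pairs in degrees and that anchor locations such as home or office are already known per user. The anchor dictionary and the feature names are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))  # 6371 km: assumed Earth radius


def distance_features(events, anchors):
    """Per event: distance to the previous event and to each named anchor."""
    rows = []
    previous = None
    for lat, lon in events:
        row = {
            # The first event has no predecessor, so the feature is left missing.
            "dist_prev_km": None if previous is None
            else haversine_km(previous[0], previous[1], lat, lon)
        }
        for name, (alat, alon) in anchors.items():
            row[f"dist_{name}_km"] = haversine_km(lat, lon, alat, alon)
        rows.append(row)
        previous = (lat, lon)
    return rows
```

For example, `distance_features(events, {"home": (32.08, 34.78)})` yields one dictionary per event with `dist_prev_km` and `dist_home_km` keys; the coordinates here are placeholders.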
Scalability & efficiency
H2O - AutoML
Why
- scales statistics, machine learning and math over BigData
- is extensible and users can build blocks using simple math legos in the core
- faster and better predictive modeling
- has a vision of online scoring and modeling in a single platform
- automatically builds ensembles with RF, XGBoost
Shlomi
Hyperparameter/model tuning? Very simple, and performance is very good. Just supply your features; it will apply grid search and ensembling and return the best model.
Let Shlomi know: how to install, schedule a meeting. H2O tutorial for understanding. Need computational power or patience to
If you want to test, I'll be happy to help.
Sensitivity Analysis
Explore users (correct vs. incorrect), compare clustering visualizations at different steps