Other projects

Image by Alexandre Debiève

Recent projects:

DispatchPi : a pair of e-paper frames to easily exchange pictures via the Gmail API (2022-3)

Code is here and Reddit post here

My partner is working in Northern Quebec and I was looking for a way to swap pictures on a regular basis. The internet connection up there is slow and unreliable. Energy consumption is also important, as electricity in Nunavik communities is generated by burning diesel. I built a pair of communicating picture frames powered by Raspberry Pi Zeros with e-ink screens. Each screen has a resolution of 800x600 and relies on the same kind of technology used in e-readers. These screens continue displaying an image even when powered off, as e-ink involves physical pigment particles that are rearranged by jolts of current. Each frame’s job is to fetch an image file at regular intervals from a specific URL. At this URL, there is a Flask website hosted on Google Cloud Run that pulls the latest image received as an attachment in a Gmail account, with the help of the Gmail API and OAuth. Figuring out auth tokens was a bit of a nightmare, but everything works!
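
As a rough illustration of the frame-side loop described above, here is a minimal Python sketch. It is not the project's actual code: the URL is a placeholder, the function names (`download`, `poll_once`, `run`) are hypothetical, and skipping the refresh when the image has not changed is my own assumption about how one might save energy. The display call is passed in as `show`, since the e-ink driver is not covered here.

```python
import hashlib
import time
import urllib.request

# Hypothetical placeholder; the real frames point at a Flask site on Cloud Run.
FRAME_URL = "https://example.com/latest-image"


def download(url=FRAME_URL, timeout=30):
    """Return the raw bytes served at the frame's URL."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def poll_once(fetch, show, last_digest):
    """Fetch the image once and call show() only if its content changed.

    Returns the digest of the image currently on screen.
    """
    try:
        data = fetch()
    except OSError:
        # Slow or dropped connection: keep the picture already displayed.
        return last_digest
    digest = hashlib.sha256(data).hexdigest()
    if digest != last_digest:
        show(data)
    return digest


def run(fetch, show, interval_s=3600):
    """Poll forever, sleeping interval_s seconds between attempts."""
    digest = None
    while True:
        digest = poll_once(fetch, show, digest)
        time.sleep(interval_s)
```

On a frame, this would be started with something like `run(download, show_on_screen)`, where `show_on_screen` stands in for whatever function pushes image bytes to the e-ink panel.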

Classroom social distancing optimizer (2021)

Repo link here and MVP demo here

Language & Packages: Python (mostly NumPy, Pygame)

This is a heuristic algorithm that optimizes the capacity of a fixed-seat room with respect to social distancing. It ingests a set of 2D chair coordinates and numbers, and outputs a list of chairs to be occupied. A GUI helps users import files, set their parameters and then observe the outcome. Data can then be exported from the app. Choosing which chairs to occupy is an optimization problem. To favour speed and simplicity, our heuristic runs a given seating algorithm a great number of times and keeps a new solution only when it increases the capacity. This makes it likely that we get close to the best possible result, especially in smaller rooms.

There are a few parameters to set, including:

  • The social distance to set between each chair (in metres)
  • Whether you want to divide the room into independent sub-groups
  • The number of iterations without improvements the algorithm should run for
  • The maximum time the algorithm should run for (in minutes)
  • The search strategy
    • The first chair is usually chosen randomly. Its neighbours that become ineligible are removed from play, and different search strategies are implemented to occupy the next chair.
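
The general idea can be sketched as a random-restart greedy search. The code below is a simplified illustration using only the standard library, not the repository's implementation: the function names are hypothetical, the only strategy shown is "pick the next chair at random", and the time limit and sub-group options are left out.

```python
import math
import random


def greedy_pass(chairs, min_distance, rng):
    """One pass: visit chairs in random order and seat each one that is
    far enough from every chair already occupied."""
    order = list(chairs)
    rng.shuffle(order)
    occupied = []
    for number in order:
        if all(math.dist(chairs[number], chairs[other]) >= min_distance
               for other in occupied):
            occupied.append(number)
    return occupied


def optimize_seating(chairs, min_distance, max_stale=200, seed=0):
    """Repeat greedy passes and keep a solution only when capacity increases.

    chairs: dict mapping chair number -> (x, y) coordinates in metres.
    max_stale: number of consecutive passes without improvement before stopping.
    """
    rng = random.Random(seed)
    best = []
    stale = 0
    while stale < max_stale:
        candidate = greedy_pass(chairs, min_distance, rng)
        if len(candidate) > len(best):
            best, stale = candidate, 0
        else:
            stale += 1
    return sorted(best)


# Five chairs in a row, one metre apart, with a two-metre distance rule.
row = {1: (0.0, 0.0), 2: (1.0, 0.0), 3: (2.0, 0.0), 4: (3.0, 0.0), 5: (4.0, 0.0)}
print(optimize_seating(row, min_distance=2.0))
```

Because each pass is cheap, running many of them and keeping the best is usually enough to reach or approach the maximum capacity in a small room.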

This group project was assembled for my class on algorithms for optimization and big data. Contributors:

  • Mahnaz Gol
  • Emanuel Senay-Lussier

Hike Finder (2021)

Live version here

Language & Packages: R (mostly Shiny, googlesheets4, googleway, dplyr)

This is a proof of concept built in R and Shiny. It lets you choose a hiking trail in Quebec based on certain conditions, including driving distance.

The biggest selling point of this tool is that it calculates your driving distance to all trails and lists the closest ones that match your filters. The Google autocomplete and geocoding APIs are used to standardize the address field and then transform it into a set of coordinates. The googleway package is then used to interface with the Google Maps API, mostly to calculate driving distance.
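
The filter-then-rank step can be illustrated with a short Python sketch (the app itself is written in R). Everything here is an assumption made for illustration: the trail records and field names are made up, and straight-line haversine distance stands in for the driving distance that the real app gets from the Google Maps API.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees.
    This is a stand-in for driving distance, not what the app uses."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def closest_trails(trails, origin, max_length_km=None, limit=5,
                   distance_fn=haversine_km):
    """Keep trails that match the filter, then list the closest ones."""
    matches = [t for t in trails
               if max_length_km is None or t["length_km"] <= max_length_km]
    ranked = sorted(matches, key=lambda t: distance_fn(origin, t["coords"]))
    return [(t["name"], round(distance_fn(origin, t["coords"]), 1))
            for t in ranked[:limit]]


# Made-up example records; "origin" would come from the geocoded address.
trails = [
    {"name": "Trail A", "coords": (45.60, -73.60), "length_km": 5},
    {"name": "Trail B", "coords": (46.80, -71.20), "length_km": 3},
    {"name": "Trail C", "coords": (45.55, -73.50), "length_km": 20},
]
origin = (45.50, -73.57)
print(closest_trails(trails, origin, max_length_km=10))
```

Passing a different `distance_fn` is where a call to a routing service would slot in.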

Authentication is done through a .json file and an API key integrated in the main R file.

The code includes a few secret API keys and so is not immediately shareable, but I’m happy to share it on request.


Book recommender system (WIP - 2021)

Language & Packages: Python

This project suggests new books to a user based on their previous ratings.

We delved into the Goodbooks-10k dataset, which includes 6M individual reviews and 10 000 different books. The data was merged with additional tags present in the Best books ever dataset, which contains 50 000 books. Missing descriptions for 1800 books were then scraped through the Goodreads API, which, bizarrely enough, still works even though its discontinuation was announced in December 2020. The cleaned and aggregated dataset will be released for public use soon.

A few strategies were applied to build efficient recommendations, including linear regression, content and user-based filtering strategies, and neural networks.
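
To give a sense of what user-based filtering means in practice, here is a toy Python sketch. It is not the project's code: the function names and the tiny ratings table are invented for illustration, and cosine similarity over co-rated books is just one common choice of similarity measure.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two users, computed over the books both rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[book] * b[book] for book in common)
    norm_a = math.sqrt(sum(a[book] ** 2 for book in common))
    norm_b = math.sqrt(sum(b[book] ** 2 for book in common))
    return dot / (norm_a * norm_b)


def recommend(ratings, user, top_n=3):
    """Rank books the user has not rated by the similarity-weighted average
    of the ratings given by other users."""
    target = ratings[user]
    weighted_sum, weight_total = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        similarity = cosine_similarity(target, other_ratings)
        if similarity <= 0:
            continue
        for book, rating in other_ratings.items():
            if book in target:
                continue
            weighted_sum[book] = weighted_sum.get(book, 0.0) + similarity * rating
            weight_total[book] = weight_total.get(book, 0.0) + similarity
    predicted = {book: weighted_sum[book] / weight_total[book]
                 for book in weighted_sum}
    return sorted(predicted, key=lambda book: (-predicted[book], book))[:top_n]


# Invented ratings on a 1-5 scale.
ratings = {
    "ann": {"b1": 5, "b2": 1},
    "bob": {"b1": 5, "b2": 1, "b3": 5, "b4": 2},
    "cat": {"b1": 1, "b2": 5, "b3": 1, "b4": 5},
}
print(recommend(ratings, "ann"))
```

Here "ann" rates books like "bob" does, so the book "bob" loved ("b3") ranks ahead of the one "cat" loved ("b4").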

Collaborators:

  • Yifan Yin
  • Jingyi Zou

Incubator stats : a simple dashboard for incubators (2021)

Live version here

Language & Packages: R (Shiny, ggplot2, dplyr)

Link to repo: https://github.com/malcolmosh/incubator_stats

This dashboard displays a few simple metrics and graphs in a readable interface, with filtering options. The demo is available here. Artificial data was generated for this project and is stored in a Google Sheet here.

Basically, this was an interesting foray into using Google Sheets as a flexible data source. Multiple graphs are generated with ggplot2.

Note that the interface of this personal project is in French.


Tangerinr : an R package for analysing banking data (2021)

Language & Packages: R (tidyverse, ggplot2)

Link to repo: https://github.com/malcolmosh/tangerinr

This is a functional but still limited R package that can be installed directly from GitHub. It can import debit statements from Tangerine Bank (Canada) located in a local folder, aggregate them, and then produce some simple visualizations.
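
The import-and-aggregate step looks roughly like the Python sketch below (the package itself is written in R). The column names `date` and `amount`, the ISO date format and the function name are all assumptions made for this example; they do not describe the actual layout of Tangerine statements or the package's API.

```python
import csv
from collections import defaultdict
from pathlib import Path


def monthly_totals(folder):
    """Read every CSV statement in a folder and sum amounts per month.

    Assumes each file has 'date' (YYYY-MM-DD) and 'amount' columns;
    both names are hypothetical.
    """
    totals = defaultdict(float)
    for path in sorted(Path(folder).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                month = row["date"][:7]  # keep the "YYYY-MM" part
                totals[month] += float(row["amount"])
    return dict(sorted(totals.items()))
```

The resulting month-to-total mapping is the kind of aggregate that can then be fed to a plotting library.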

This project was completed for my class on statistical software.