Ideas: Handwriting recognition system and input method, Decentralized Knowledge systems, Language computing related
Listing some project ideas here:
Handwriting recognition system and input method
- Port the free and open source handwriting recognition system at https://handwriting.smc.org.in/ to Java or another language suitable for integration with Indic keyboard.
- While working on this, the participant is expected to help improve the accuracy of the algorithm by adding a postprocessing module that eliminates unlikely candidates and reorders the best ones
- Add support for more scripts
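One possible shape for the postprocessing module mentioned above is sketched below. It is only an illustration: the function name `rerank`, the threshold, the weighting and the idea of a character prior (for example, how likely a character is after the text already typed) are all assumptions, not part of the existing system.

```python
def rerank(candidates, prior, min_score=0.3, prior_weight=0.5):
    """Drop weak candidates, then reorder the rest.

    candidates: list of (character, shape_score) pairs, scores in [0, 1]
    prior: dict mapping character -> probability from some language model
           (hypothetical; any source of context knowledge would do)
    """
    # Step 1: eliminate candidates whose shape match is too poor.
    kept = [(ch, score) for ch, score in candidates if score >= min_score]

    # Step 2: combine the shape score with the prior and sort best-first.
    def combined(item):
        ch, score = item
        return (1 - prior_weight) * score + prior_weight * prior.get(ch, 0.0)

    return [ch for ch, _ in sorted(kept, key=combined, reverse=True)]


# Made-up scores: the matcher slightly prefers the first character,
# but the prior favours the second one in this context.
candidates = [("ഗ", 0.80), ("ശ", 0.78), ("ത", 0.10)]
prior = {"ശ": 0.6, "ഗ": 0.2}
best_first = rerank(candidates, prior)
```

Keeping elimination and reordering as separate steps makes it easy to experiment with each one independently.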
While working on this, the participant will learn and practice:
- UI/UX considerations for a live handwriting recognition based input method
- Frontend programming using JavaScript, and Android programming to add system-level input methods
- Characteristics of Indian language scripts: Unicode, visual representation, data representation, and the conversions between them
- Predictive entry systems
- Curve matching and Procrustes analysis
- The SVG standard and curve path representation in SVG
- Programming languages: JavaScript, Python, Java or C
- Ability to experiment and self-learn technology to solve a problem with very little help from the mentor(!), while learning the skill of communicating effectively in the community to get things done
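To give a feel for the curve matching and Procrustes analysis item above, here is a minimal sketch in plain Python. It assumes both strokes have already been resampled to the same number of points in the same drawing order. The function names are illustrative, and this is not necessarily how the existing recognition system compares strokes.

```python
import math


def normalize(points):
    """Center a stroke on its centroid and scale it to unit size.

    Points are (x, y) pairs; they are returned as complex numbers x + yj.
    """
    zs = [complex(x, y) for x, y in points]
    centroid = sum(zs) / len(zs)
    centered = [z - centroid for z in zs]
    norm = math.sqrt(sum(abs(z) ** 2 for z in centered))
    if norm == 0:
        raise ValueError("stroke has no extent")
    return [z / norm for z in centered]


def procrustes_disparity(stroke_a, stroke_b):
    """Squared full Procrustes distance between two 2D point sequences.

    Close to 0 when the strokes have the same shape up to translation,
    scale and rotation; at most 1.
    """
    if len(stroke_a) != len(stroke_b):
        raise ValueError("strokes must have the same number of points")
    a = normalize(stroke_a)
    b = normalize(stroke_b)
    # For unit-size shapes, the best rotation leaves a residual of
    # 1 - |<a, b>|^2, where <a, b> is the complex inner product.
    inner = sum(za * zb.conjugate() for za, zb in zip(a, b))
    return 1.0 - abs(inner) ** 2


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
# The same square rotated by 90 degrees, scaled by 3 and moved.
moved = [(5, 5), (5, 8), (2, 8), (2, 5)]
line = [(0, 0), (1, 0), (2, 0), (3, 0)]

same_shape = procrustes_disparity(square, moved)
different_shape = procrustes_disparity(square, line)
```

Note that full rotation invariance is not always desirable for handwriting, since some characters differ mainly in orientation. A practical matcher may want to limit or penalize rotation.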
Significant progress in this project means:
- The first successful attempt at a functional handwriting-based input method that is not proprietary, works offline and respects your privacy
- A lower barrier to using Indic languages on mobile devices
Decentralized Knowledge systems starting with Wikipedia
This project continues the experiments at https://github.com/santhoshtr/wikipedia-ipfs and is meant for people interested in learning and practicing web 3.0 and the p2p web. The field is open to exploration and innovation rather than being a fixed problem with a fixed solution. Being able to access Wikipedia from the p2p web, by forming a sufficiently large set of p2p nodes in a geography, would be a good step forward.
While working on this, the participant will learn and practice:
- Concepts of p2p and decentralized networks: InterPlanetary Linked Data (IPLD), IPFS, IPNS, DHT, WebRTC
- Hands-on experience building p2p networks and nodes
- Basic understanding of linked data in p2p
- Frontend programming
- Programming languages: JavaScript, Go
- Ability to experiment and self-learn technology to solve a problem with very little help from the mentor(!), while learning the skill of communicating effectively in the community to get things done
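The core idea behind systems like IPFS is content addressing: a block of data is identified by a hash of its content rather than by where it is stored. The sketch below shows that idea with an in-memory store. It is deliberately simplified: the class name `ContentStore` is made up, the key is a plain SHA-256 hex digest, and real IPFS identifiers (CIDs), chunking and DHT lookups are more involved.

```python
import hashlib


class ContentStore:
    """A toy content-addressed block store (illustrative only)."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself.
        key = hashlib.sha256(data).hexdigest()
        self._blocks[key] = data
        return key

    def get(self, key: str) -> bytes:
        data = self._blocks[key]
        # Any node can verify a block it receives from an untrusted peer
        # by hashing it again and comparing with the requested address.
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("block does not match its address")
        return data


store = ContentStore()
address = store.put(b"An article fetched from a peer")
article = store.get(address)
```

Because the address depends only on the content, the same article stored on two different nodes gets the same address, and a reader does not need to trust the node that served it.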
Malayalam related
Some of these can be ported to other languages too.
- Integrate the https://morph.smc.org.in/spellcheck system as a spellchecker for LibreOffice (refer: https://thottingal.in/blog/2019/03/10/libreoffice-malayalam-spellchecker-using-mlmorph/)
- Develop a named entity recognition system based on mlmorph (refer: https://gitlab.com/smc/mlmorph-ner)
- Enhance https://gitlab.com/smc/mlmorph
- Make https://gitlab.com/smc/mlmorph-spellchecker work with KDE or GNOME systems
- Convert a Malayalam font like Chilanka into a font with drawing direction indicators, to help people learn to write the language (refer: https://i.imgur.com/EBVtDOO.png)
- Help maintain SMC fonts by fixing reported bugs (the participant will learn typography and OpenType engineering along the way, if interested)