GSoC/2020/StatusReports/JeanLimaAndrade: Difference between revisions

From KDE Community Wiki
Revision as of 12:21, 1 July 2020

Project Overview

marK is a machine learning dataset annotation tool under development that helps users annotate multiple types of data for training supervised classification models. The objective of this project is to add text annotation support and to refactor the codebase so that the image annotation logic is separated from the core of marK, making the codebase more extensible and new annotation types easier to add.
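
The separation described above can be pictured with a minimal sketch. This is an illustration only, not marK's actual code: the names `Annotator`, `ImageAnnotator`, `TextAnnotator` and `Core` are hypothetical, and the file-extension check is an assumption made to keep the example short. The idea is that the core depends only on a small interface, so adding a new annotation type means adding a class rather than editing the core.

```python
from abc import ABC, abstractmethod


class Annotator(ABC):
    """Hypothetical interface the core depends on (not marK's real API)."""

    @abstractmethod
    def supports(self, path: str) -> bool:
        """Return True if this annotator can handle the given file."""

    @abstractmethod
    def annotate(self, path: str, label: str) -> dict:
        """Return an annotation record for the given file."""


class ImageAnnotator(Annotator):
    # Assumption: images are recognised by file extension only.
    def supports(self, path: str) -> bool:
        return path.lower().endswith((".png", ".jpg", ".jpeg"))

    def annotate(self, path: str, label: str) -> dict:
        return {"type": "image", "file": path, "label": label}


class TextAnnotator(Annotator):
    # Assumption: text documents are plain .txt files.
    def supports(self, path: str) -> bool:
        return path.lower().endswith(".txt")

    def annotate(self, path: str, label: str) -> dict:
        return {"type": "text", "file": path, "label": label}


class Core:
    """Core logic that knows only the Annotator interface."""

    def __init__(self, annotators):
        self.annotators = list(annotators)

    def annotate(self, path: str, label: str) -> dict:
        # Delegate to the first annotator that accepts the file.
        for annotator in self.annotators:
            if annotator.supports(path):
                return annotator.annotate(path, label)
        raise ValueError(f"no annotator registered for {path!r}")


core = Core([ImageAnnotator(), TextAnnotator()])
image_record = core.annotate("cat.png", "cat")
text_record = core.annotate("review.txt", "positive")
```

In this sketch, supporting another data type would only require registering one more `Annotator` subclass with `Core`.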

Mentor: Caio Jordão Carvalho

This page is still being written

Milestones

  • Refactor marK codebase, separating the image annotation logic from the core
    • Status: Doing
  • Text Annotation support
    • Status: Pending

Work Report

Community Bonding Period

I took advantage of this period to study how text annotation tools work, so that I could later apply the strengths I found and correct or improve the parts I didn't like. Text annotation covers several tasks, each aimed at a different objective or niche: phrase chunking, named entity recognition (NER), named entity linking (NEL), and so on. For a fuller explanation, please see my blog post about text annotation.
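
To make the NER case concrete, here is a minimal sketch of how an entity annotation is commonly stored: a labelled span of character offsets over the text. It is an illustration under stated assumptions, not marK's data model; the `Span` class, the `mark_entity` helper and the `PERSON`/`ORG` labels are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A labelled region of text (hypothetical structure, not marK's)."""
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str  # entity class, e.g. "PERSON" or "ORG"


def mark_entity(text: str, phrase: str, label: str) -> Span:
    """Return a Span covering the first occurrence of phrase in text."""
    start = text.find(phrase)
    if start == -1:
        raise ValueError(f"{phrase!r} does not occur in the text")
    return Span(start, start + len(phrase), label)


text = "Jean contributes to KDE."
spans = [
    mark_entity(text, "Jean", "PERSON"),
    mark_entity(text, "KDE", "ORG"),
]

# Recover the annotated substrings from the stored offsets.
entities = [(text[s.start:s.end], s.label) for s in spans]
```

Storing offsets rather than copies of the text keeps each annotation tied to an exact position, which matters when the same word appears more than once.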

I also deepened my knowledge by studying coding techniques, best practices, and ways to make code more efficient and more readable.

blog posts:

community bonding introduction

a bit about text annotation

Coding Period - First evaluation

In the first coding period I merged pending code into the master branch in !2. During this period marK's structure changed, and so did my plans for how to tackle text annotation. See the posts for more explanation.


blog posts:

week 1, 2 and 3

Coding Period - Second evaluation

Coding Period - Third evaluation

Important Links

You can see my proposal here. For my GSoC 2020 posts, check my blog.

About me

Name: Jean Lima Andrade

Invent id: jyeno

IRC Nick: jyeno

Telegram Nick: jyeno