Natural Language Processing Specialization

Overview

Natural Language Processing Specialization


WHAT I LEARNED


  • Use logistic regression, naïve Bayes, and word vectors to implement sentiment analysis, complete analogies & translate words.

  • Use dynamic programming, hidden Markov models, and word embeddings to implement autocorrect, autocomplete & identify part-of-speech tags for words.

  • Use recurrent neural networks, LSTMs, GRUs & Siamese networks in Trax for sentiment analysis, text generation & named entity recognition.

  • Use encoder-decoder, causal, & self-attention to machine translate complete sentences, summarize text, build chatbots & question-answering.

There are 4 Courses in this Specialization


Course 1 - Natural Language Processing with Classification and Vector Spaces

  • In the first course of the Natural Language Processing Specialization

  • I performed sentiment analysis of tweets using logistic regression and then naïve Bayes,

  • I used vector space models to discover relationships between words and used PCA to reduce the dimensionality of the vector space and visualize those relationships, and

  • I wrote a simple English to French translation algorithm using pre-computed word embeddings and locality-sensitive hashing to relate words via approximate k-nearest neighbor search.

Projects


Course 2 - Natural Language Processing with Probabilistic Models

  • In the second course of the Natural Language Processing Specialization

  • I wrote a simple auto-correct algorithm using minimum edit distance and dynamic programming,

  • I applied the Viterbi Algorithm for part-of-speech (POS) tagging, which is vital for computational linguistics,

  • I wrote a better auto-complete algorithm using an N-gram language model, and

  • I wrote my own Word2Vec model that uses a neural network to compute word embeddings using a continuous bag-of-words model.

Projects


Course 3 - Natural Language Processing with Sequence Models

  • In the third course of the Natural Language Processing Specialization

  • I trained a neural network with GLoVe word embeddings to perform sentiment analysis of tweets,

  • I generated synthetic Shakespeare text using a Gated Recurrent Unit (GRU) language model,

  • I trained a recurrent neural network to perform named entity recognition (NER) using LSTMs with linear layers, and

  • I used so-called ‘Siamese’ LSTM models to compare questions in a corpus and identify those that are worded differently but have the same meaning.

Projects


Course 4 - Natural Language Processing with Attention Models

Projects


Disclaimer


  • DeepLearning.AI makes course notes available for educational purposes.
  • Project solutions are just for educational purposes. I highly recommend trying and solving project/program assignments on your own.

All the best 🤘

Owner
Kaan BOKE
Data Scientist
Kaan BOKE
State-of-the-art NLP through transformer models in a modular design and consistent APIs.

Trapper (Transformers wRAPPER) Trapper is an NLP library that aims to make it easier to train transformer based models on downstream tasks. It wraps h

Open Business Software Solutions 42 Sep 21, 2022
A Multi-modal Model Chinese Spell Checker Released on ACL2021.

ReaLiSe ReaLiSe is a multi-modal Chinese spell checking model. This the office code for the paper Read, Listen, and See: Leveraging Multimodal Informa

DaDa 106 Dec 29, 2022
NLP-Project - Used an API to scrape 2000 reddit posts, then used NLP analysis and created a classification model to mixed succcess

Project 3: Web APIs & NLP Problem Statement How do r/Libertarian and r/Neoliberal differ on Biden post-inaguration? The goal of the project is to see

Adam Muhammad Klesc 2 Mar 29, 2022
Kurumi ChatBot

KurumiChatBot Just another Telegram AI chat bot written in Python using Pyrogram. A public running instance can be found on telegram as @TokisakiChatB

Yoga Pranata 3 Jun 28, 2022
Estimation of the CEFR complexity score of a given word, sentence or text.

NLP-Swedish … allows to estimate CEFR (Common European Framework of References) complexity score of a given word, sentence or text. CEFR scores come f

3 Apr 30, 2022
MPNet: Masked and Permuted Pre-training for Language Understanding

MPNet MPNet: Masked and Permuted Pre-training for Language Understanding, by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu, is a novel pre-tr

Microsoft 228 Nov 21, 2022
aMLP Transformer Model for Japanese

aMLP-japanese Japanese aMLP Pretrained Model aMLPとは、Liu, Daiらが提案する、Transformerモデルです。 ざっくりというと、BERTの代わりに使えて、より性能の良いモデルです。 詳しい解説は、こちらの記事などを参考にしてください。 この

tanreinama 13 Aug 11, 2022
Wikipedia-Utils: Preprocessing Wikipedia Texts for NLP

Wikipedia-Utils: Preprocessing Wikipedia Texts for NLP This repository maintains some utility scripts for retrieving and preprocessing Wikipedia text

Masatoshi Suzuki 44 Oct 19, 2022
Predict the spans of toxic posts that were responsible for the toxic label of the posts

toxic-spans-detection An attempt at the SemEval 2021 Task 5: Toxic Spans Detection. The Toxic Spans Detection task of SemEval2021 required participant

Ilias Antonopoulos 3 Jul 24, 2022
The entmax mapping and its loss, a family of sparse softmax alternatives.

entmax This package provides a pytorch implementation of entmax and entmax losses: a sparse family of probability mappings and corresponding loss func

DeepSPIN 330 Dec 22, 2022
UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language

UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language This repository contains UA-GEC data and an accompanying Python lib

Grammarly 227 Jan 02, 2023
Resources for "Natural Language Processing" Coursera course.

Natural Language Processing course resources This github contains practical assignments for Natural Language Processing course by Higher School of Eco

Advanced Machine Learning specialisation by HSE 1.1k Jan 01, 2023
Python powered crossword generator with database with 20k+ polish words

crossword_generator Generate simple crossword puzzle from words and definitions fetched from krzyżowki.edu.pl endpoints -/ string:word - returns js

0 Jan 04, 2022
Module for automatic summarization of text documents and HTML pages.

Automatic text summarizer Simple library and command line utility for extracting summary from HTML pages or plain texts. The package also contains sim

Mišo Belica 3k Jan 08, 2023
Guide to using pre-trained large language models of source code

Large Models of Source Code I occasionally train and publicly release large neural language models on programs, including PolyCoder. Here, I describe

Vincent Hellendoorn 947 Dec 28, 2022
Write Python in Urdu - اردو میں کوڈ لکھیں

UrduPython Write simple Python in Urdu. How to Use Write Urdu code in سامپل۔پے The mappings are as following: "۔": ".", "،":

Saad A. Bazaz 26 Nov 27, 2022
Text editor on python to convert english text to malayalam(Romanization/Transiteration).

Manglish Text Editor This is a simple transiteration (romanization ) program which is used to convert manglish to malayalam (converts njaan to ഞാൻ ).

Merin Rose Tom 1 May 11, 2022
English loanwords in the world's languages

Wiktionary as CLDF Content cldf1 and cldf2 contain cldf-conform data sets with a total of 2 377 756 entries about the vocabulary of all 1403 languages

Viktor Martinović 3 Jan 14, 2022
Open-source offline translation library written in Python. Uses OpenNMT for translations

Open source neural machine translation in Python. Designed to be used either as a Python library or desktop application. Uses OpenNMT for translations and PyQt for GUI.

Argos Open Tech 1.6k Jan 01, 2023
Code for "Generative adversarial networks for reconstructing natural images from brain activity".

Reconstruct handwritten characters from brains using GANs Example code for the paper "Generative adversarial networks for reconstructing natural image

K. Seeliger 2 May 17, 2022