Syntactic Parsing and CKY Algorithm Guide

Syntactic Parsing involves analyzing a sentence's grammatical structure to produce a parse tree, with the CKY Algorithm being a key method for parsing using Context-Free Grammar in Chomsky Normal Form. Statistical Parsing assigns probabilities to different parse trees to select the most likely interpretation, and Probabilistic Context Free Grammar (PCFG) incorporates probabilities into CFG rules. Probabilistic CKY Parsing builds on the CKY Algorithm by storing probabilities, allowing for the selection of the most probable parse tree.


1. SYNTACTIC PARSING
What is Syntactic Parsing?

Syntactic Parsing (often simply called parsing) is the process of taking a sentence and analyzing its grammatical structure using a grammar.

It tells us:

✔ which words group together
✔ what their roles are
✔ the structure of the sentence (the parse tree)

Example Sentence

“The boy eats an apple.”

A parser finds:

 “The boy” → Noun phrase (NP)
 “eats an apple” → Verb phrase (VP)
 Entire sentence → Sentence (S)

Output is usually a parse tree.

Why is Parsing Important?

 Machine Translation
 Question Answering
 Information Extraction
 Grammar Checking
 Sentiment Analysis
2. COCKE–YOUNGER–KASAMI (CKY) PARSING ALGORITHM
The CKY Algorithm is one of the most important algorithms in syntactic parsing.

2.1 What is the CKY Algorithm?

CKY is a bottom-up parsing algorithm used to parse sentences using:

 a Context-Free Grammar (CFG)
 in Chomsky Normal Form (CNF)

It builds a dynamic programming table to find all possible parses.

2.2 Requirements of CKY

CKY works only with:

✔ a CFG
✔ in CNF (Chomsky Normal Form)

CNF rules have one of two forms:

1. A → B C (two non-terminals)
2. A → a (a single terminal)

No rule may have more than two symbols on the right-hand side.
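The two rule shapes can be checked mechanically. A minimal sketch (the representation, with right-hand sides as tuples and a set of non-terminal names, is an assumption of this example):

```python
def is_cnf_rule(lhs, rhs, nonterminals):
    """Check that a rule lhs -> rhs has one of the two CNF shapes."""
    if len(rhs) == 2:
        # A -> B C: both right-hand symbols must be non-terminals
        return all(symbol in nonterminals for symbol in rhs)
    # A -> a: a single terminal (so unit rules A -> B are rejected)
    return len(rhs) == 1 and rhs[0] not in nonterminals

nts = {"S", "NP", "VP", "Det", "N", "V"}
print(is_cnf_rule("NP", ("Det", "N"), nts))      # True
print(is_cnf_rule("Det", ("the",), nts))         # True
print(is_cnf_rule("S", ("NP", "VP", "PP"), nts)) # False: three symbols
```

Note that a unit rule such as NP → N also fails the check, since CNF allows a single symbol on the right only when it is a terminal.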

2.3 CKY Parsing Table (Easy Explanation)

CKY uses a triangular table: the upper triangle of an n × n matrix for a sentence of n words.

Each cell (i, j) stores the non-terminals that can generate the substring from word i to word j.
2.4 CKY Algorithm Steps (Very Easy)
Step 1: Convert grammar to CNF

Example grammar:

S → NP VP
NP → Det N
VP → V NP
Det → “the”
N → “boy”
V → “saw”

Step 2: Fill diagonal cells (words of the sentence)

Sentence: “the boy saw the boy”

Fill entries:

 Cell(1,1) ← “the” → Det
 Cell(2,2) ← “boy” → N
 Cell(3,3) ← “saw” → V
 Cell(4,4) ← “the” → Det
 Cell(5,5) ← “boy” → N
Step 3: Fill upper cells bottom-up

Try all split points:

Example:
Substring (1,2): “the boy”

Check combinations:
Det + N → NP
So cell(1,2) = NP

Continue for larger spans until the top cell contains S.

Step 4: Accept the sentence if S is in cell(1,n)
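The four steps above can be sketched as a small Python recognizer using the toy grammar from Step 1. (The sentence here is “the boy saw the boy”: this grammar needs a full NP after the verb to build a VP, so a sentence ending at “saw” would be rejected.)

```python
# Toy CNF grammar from Step 1, with right-hand-side pairs mapped to parents.
# (Assumes each B C pair has one parent, which holds for this tiny grammar.)
binary_rules = {
    ("NP", "VP"): "S",
    ("Det", "N"): "NP",
    ("V", "NP"): "VP",
}
lexical_rules = {
    "the": {"Det"},
    "boy": {"N"},
    "saw": {"V"},
}

def cky_recognize(words):
    """Return True if the CNF grammar can derive the sentence from S."""
    n = len(words)
    # table[i][j] holds the non-terminals that generate words[i:j]
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        table[i][i + 1] = set(lexical_rules.get(word, ()))
    for span in range(2, n + 1):          # span widths, bottom-up
        for i in range(n - span + 1):     # start positions
            j = i + span
            for k in range(i + 1, j):     # all split points
                for B in table[i][k]:
                    for C in table[k][j]:
                        if (B, C) in binary_rules:
                            table[i][j].add(binary_rules[(B, C)])
    return "S" in table[0][n]

print(cky_recognize("the boy saw the boy".split()))  # True
```

The three nested loops (span, start, split) are exactly Step 3, and the final membership test is Step 4.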


2.5 Advantages of CKY
 Efficient dynamic programming
 Finds all possible parses
 Works well with PCFG (probabilistic version)

3. STATISTICAL PARSING BASICS


What is Statistical Parsing?

Statistical Parsing assigns probabilities to different parse trees and chooses the most likely one.

Why Statistical Parsing?

Grammar alone may produce many valid trees.

Example:

“I saw the man with a telescope.”

Two possible meanings (I used the telescope, or the man had the telescope) → two parse trees.

Statistical parsing chooses the most probable interpretation.

Core Idea

Attach probabilities to:

 Rules
 Parses
 Trees

Then use algorithms to compute the best parse.


4. PROBABILISTIC CONTEXT-FREE GRAMMAR (PCFG)
What is a PCFG?

A PCFG = CFG + Probability for each rule.

Each grammar rule has a probability:

Example:

NP → Det N 0.6
NP → NP PP 0.4

4.1 PCFG Rule Probability

The probability of a rule is estimated from a treebank as:

P(A → β) = Count(A → β) / Count(A)

i.e., the number of times the rule appears divided by the total number of expansions of that non-terminal.
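This count ratio (a maximum-likelihood estimate) can be sketched in a few lines of Python; the counts below are hypothetical, chosen to reproduce the 0.6 / 0.4 example above:

```python
from collections import Counter

# Hypothetical treebank counts of NP expansions.
rule_counts = Counter({
    ("NP", ("Det", "N")): 60,
    ("NP", ("NP", "PP")): 40,
})

def rule_probability(lhs, rhs, counts):
    """MLE estimate: count(A -> beta) / total count of A expansions."""
    total = sum(c for (a, _), c in counts.items() if a == lhs)
    return counts[(lhs, rhs)] / total

print(rule_probability("NP", ("Det", "N"), rule_counts))  # 0.6
```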

4.2 Probability of a Parse Tree

Multiply the probabilities of all rules used in building the tree.

P(parse tree) = ∏ P(rule_i)

The most likely parse tree = tree with highest probability.
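As a worked example, here is the product over every rule used in one full parse tree of “the boy saw the boy” (all probabilities illustrative, not from a real treebank):

```python
import math

# (rule, probability) for every rule application in one parse tree.
rules_used = [
    ("S -> NP VP", 1.0),
    ("NP -> Det N", 0.6), ("Det -> the", 0.7), ("N -> boy", 0.5),
    ("VP -> V NP", 0.7), ("V -> saw", 0.4),
    ("NP -> Det N", 0.6), ("Det -> the", 0.7), ("N -> boy", 0.5),
]

# P(tree) = product of all rule probabilities
tree_prob = math.prod(p for _, p in rules_used)
print(f"P(tree) = {tree_prob:.6f}")  # P(tree) = 0.012348
```

Note that a rule used twice (here NP → Det N) contributes its probability twice to the product.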

4.3 Advantages of PCFG

 Handles ambiguity
 Gives quantitative ranking of parse trees
 Used widely in early statistical NLP
5. PROBABILISTIC CKY PARSING OF PCFGs
This is the probabilistic version of the CKY algorithm.

✔ Same algorithm structure
✔ But each cell stores probabilities along with the symbols
✔ The best parse tree is chosen by maximum probability

5.1 How Does Probabilistic CKY Work?

Step 1: For diagonal cells

For the word “boy” (with illustrative lexical probabilities):

N → boy P = 0.5
NP → boy P = 0.1

We store:

Cell(2,2):

 N : 0.5
 NP : 0.1

Step 2: For each upper cell

Combine possibilities:

Example:

Cell(1,2) from “the boy”:

Det (0.7) + N (0.5) → NP with rule probability 0.6

Total probability =
0.7 × 0.5 × 0.6 = 0.21

Store NP with probability 0.21.

Step 3: Continue filling table

For every span and every split point:

P(A over span) = P(A → B C) × P(B over left part) × P(C over right part)

If A can be built in more than one way, keep only the maximum probability.

Step 4: Final parse

Top cell gives S with highest probability.
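The steps above can be sketched as a Viterbi-style CKY in Python. The rule and lexical probabilities are illustrative (the same numbers as the worked example, so cell(1,2) gets NP with 0.21); each cell keeps, for every non-terminal, the probability of its best derivation:

```python
from collections import defaultdict

# CNF PCFG with illustrative probabilities (not from a real treebank).
binary = {("NP", "VP"): [("S", 1.0)],
          ("Det", "N"): [("NP", 0.6)],
          ("V", "NP"): [("VP", 0.7)]}
lexicon = {"the": [("Det", 0.7)],
           "boy": [("N", 0.5)],
           "saw": [("V", 0.4)]}

def viterbi_cky(words):
    """Return {non-terminal: best probability} over the whole sentence."""
    n = len(words)
    best = defaultdict(dict)  # best[(i, j)][A] = max prob of A over words[i:j]
    for i, word in enumerate(words):          # Step 1: diagonal cells
        for A, p in lexicon.get(word, []):
            best[(i, i + 1)][A] = p
    for span in range(2, n + 1):              # Steps 2-3: upper cells
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for B, p_b in best[(i, k)].items():
                    for C, p_c in best[(k, j)].items():
                        for A, p_rule in binary.get((B, C), []):
                            p = p_rule * p_b * p_c
                            if p > best[(i, j)].get(A, 0.0):
                                best[(i, j)][A] = p   # keep only the max
    return dict(best[(0, n)])                 # Step 4: top cell

print(viterbi_cky("the boy".split()))  # NP with 0.7 * 0.5 * 0.6 = 0.21
```

On “the boy saw the boy” the top cell contains S with probability 1.0 × 0.21 × (0.7 × 0.4 × 0.21) = 0.012348, matching the tree-probability product in Section 4.2.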

5.2 Advantages of Probabilistic CKY


 Combines dynamic programming with probabilities
 Finds the most probable parse, not just any parse
 Works well with PCFG
 Efficient and accurate

6. SHORT SUMMARY
1. Syntactic Parsing analyzes sentence grammatical structure and produces a parse tree.
2. CKY Algorithm is a bottom-up dynamic programming parser that works with a CFG in CNF.
3. Statistical Parsing assigns probabilities to parse trees and selects the most likely one.
4. PCFG is a CFG where each production rule has a probability.
5. Probabilistic CKY Parsing extends CKY to PCFG by storing probabilities and choosing
the best parse tree using maximum likelihood.
