6 Based on ME for Kazakh phrase Identification

In the Kazakh phrase recognition task, $x$ represents the context of the words to be labeled and $y$ is the output label. The task is as follows: given an instance or context condition $x$, construct a model that accurately estimates the probability that the category label $y$ appears, written as $p(y \mid x)$.
Model input: labeled training data extracted from the training sample set, $T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$, where each pair $(x_i, y_i)$ is an occurrence in the corpus in which $y_i$ is the label and $x_i$ is its context information.
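Feature function: a feature $f$ expresses that a particular relationship holds between $x$ and $y$. It is a binary function:
$$f(x, y) = \begin{cases} 1 & \text{if } x \text{ and } y \text{ satisfy the given condition} \\ 0 & \text{otherwise} \end{cases}$$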
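A binary feature of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the context is modeled as a tuple of tokens, and the helper `make_feature`, the label names `"B-NP"` and `"O"`, and the placeholder tokens `"w0"`, `"w1"`, ... are hypothetical and do not come from the paper.

```python
from typing import Callable, Sequence

# Assumption: a context x is a sequence of tokens and a label y is a string.
Context = Sequence[str]
Feature = Callable[[Context, str], int]


def make_feature(word: str, label: str) -> Feature:
    """Build a binary feature that fires when `word` occurs in x and y == label."""
    def f(x: Context, y: str) -> int:
        return 1 if (word in x and y == label) else 0
    return f


# Hypothetical feature: token "w1" is in the context and the label is "B-NP".
f1 = make_feature("w1", "B-NP")

print(f1(("w0", "w1", "w2"), "B-NP"))  # condition holds
print(f1(("w0", "w1", "w2"), "O"))     # label differs
print(f1(("w0", "w3"), "B-NP"))        # token missing from context
```

Each call returns 1 only when both the context condition and the label condition hold, which is the two-case definition given above.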
The entropy of the model $p$:
$$H(p) = -\sum_{x, y} p(x, y) \log p(x, y)$$
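As a small numerical illustration, the entropy can be evaluated for a toy distribution. The sketch below assumes the joint form $H(p) = -\sum_{x,y} p(x,y) \log p(x,y)$ with the natural logarithm and the usual convention $0 \log 0 = 0$. The function name `entropy` and the two toy distributions are made up for the example.

```python
import math
from typing import Dict, Tuple


def entropy(p: Dict[Tuple[str, str], float]) -> float:
    """H(p) = -sum over (x, y) of p(x, y) * log p(x, y), skipping zero entries."""
    return -sum(v * math.log(v) for v in p.values() if v > 0.0)


# Two hypothetical joint distributions over the same four (x, y) pairs.
uniform = {("x1", "A"): 0.25, ("x1", "B"): 0.25,
           ("x2", "A"): 0.25, ("x2", "B"): 0.25}
peaked = {("x1", "A"): 0.70, ("x1", "B"): 0.10,
          ("x2", "A"): 0.10, ("x2", "B"): 0.10}

print(entropy(uniform))  # equals log(4) up to rounding
print(entropy(peaked))   # strictly smaller than the uniform case
```

The uniform distribution attains the largest entropy over four outcomes, which is why maximizing entropy favors the least committed distribution among those allowed.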
Maximum Entropy Model: Such a model can be shown to have the following form:
$$p^* = \arg\max_{p \in C} H(p)$$
Goal: select a distribution p from a set of allowed distributions that maximizes H(Y|X).
Compute:
$$p^*(y \mid x) = \frac{1}{Z(x)} \exp\Big(\sum_i \lambda_i f_i(x, y)\Big)$$
$$Z(x) = \sum_y \exp\Big(\sum_i \lambda_i f_i(x, y)\Big)$$
Where the $\lambda_i$ are the model parameters and the $f_i$ are the features of the model.
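To make the exponential form concrete, the following minimal sketch computes $p^*(y \mid x)$ for every candidate label by exponentiating the weighted feature sum and dividing by $Z(x)$. It does not estimate the parameters; the weights are simply given. The names `p_star`, `f1`, `f2`, the labels `"B-NP"` and `"O"`, the placeholder tokens and the weight values are all assumptions chosen for illustration.

```python
import math
from typing import Callable, Dict, Sequence

Context = Sequence[str]
Feature = Callable[[Context, str], int]


def p_star(x: Context, labels: Sequence[str],
           features: Sequence[Feature],
           lambdas: Sequence[float]) -> Dict[str, float]:
    """Return p*(y|x) = exp(sum_i lambda_i * f_i(x, y)) / Z(x) for each label y."""
    scores = {
        y: math.exp(sum(lam * f(x, y) for lam, f in zip(lambdas, features)))
        for y in labels
    }
    z = sum(scores.values())  # Z(x): sum of the unnormalized scores over all y
    return {y: s / z for y, s in scores.items()}


# Hypothetical binary features.
def f1(x: Context, y: str) -> int:
    return 1 if ("w1" in x and y == "B-NP") else 0


def f2(x: Context, y: str) -> int:
    return 1 if ("w9" in x and y == "O") else 0


# Hypothetical weights: lambda_1 = log 3, lambda_2 = 0.
probs = p_star(("w0", "w1"), ["B-NP", "O"], [f1, f2], [math.log(3.0), 0.0])
print(probs)  # roughly 0.75 for "B-NP" and 0.25 for "O"
```

With these weights the unnormalized scores are 3 for `"B-NP"` and 1 for `"O"`, so $Z(x) = 4$ and the probabilities sum to one by construction.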