[SOLVED] BIM207-Homework 1

30.99 $

Category:

Description

Rate this product
  • You are asked to calculate followings:

Average Term Length By Initial Character: For example, If your tokens are [”apple”,”banana”,”avocado”,”blueberry”], then your output should be like

a = 6  b = 7.5

○   Total Minimum Distance: For each term pair, calculate the following formula

f(t1) * f(t2)

 

1+ln d(t1,t2)

 

where f(t)​ is the count of the term t​ in the text and d(t1,t2) gives the minimum distance between t1 and t​2 where t1 is followed by t2. For example, If the text is

 

”aa bb cc aa cc dd bb” and t1 = aa and t2 = bb, then ∑d(t1,t2) = 1+3 = 4. You

 

should print only topN pairs according to the score.

 

Important !

Make sure the following commands are running mvn clean package

java -jar target\bim207hw.jar sampleText.txt 10

 

 

 

 

 

Sample Output

InitialCharacter            AverageLength

1 3.5 2 2.0 3 5.0

5          1.0

7          4.0

  • 285714285714286
  • 0
  • 333333333333333
  • 0 f 6.0 g 7.125 h 5.375 i   6.0

k 9.266666666666667 m 5.857142857142857

o 8.0 p 8.5 r 6.0

s 7.214285714285714 t 6.363636363636363

  • 0
  • 4285714285714284 y 10.0 z  7.5

ç 11.666666666666666 ö 11.090909090909092 ü 12.666666666666666

 

Pair{t1=’yerleşkesindeki’, t2=’ve’, factor=26.0}

Pair{t1=’ve’, t2=’sayılı’, factor=15.356018837890671}

Pair{t1=’tarih’, t2=’ve’, factor=13.0}

Pair{t1=’donanımlı’, t2=’ve’, factor=13.0}

Pair{t1=’öğrencileri’, t2=’ve’, factor=13.0}

Pair{t1=’söyleşilere’, t2=’ve’, factor=13.0}

Pair{t1=’yaratıcı’, t2=’ve’, factor=13.0}

Pair{t1=’eden’, t2=’ve’, factor=13.0}

Pair{t1=’ve’, t2=’30425′, factor=13.0}

Pair{t1=’kültürel’, t2=’ve’, factor=13.0}