Readability vs Cosine Similarity
- Most of the graphs show a general cone/pyramid pattern. The lower the cosine similarity, the wider the spread between the readability metric values of Document 1 and Document 2 (referred to below as the D1 and D2 values). As cosine similarity increases, the D1 and D2 values converge.
- For further research, manually inspecting document pairs with dissimilar D1 and D2 values but high cosine similarity could yield more insight into the relationship between cosine similarity and document readability scores.
- Documents with similar readability appear to have higher cosine similarity. Since readability metrics take sentence and document structure into account, this could imply that word frequency vectors also carry some information about sentence structure.
- It also appears that higher readability goes with higher similarity: the easier a document is to read, the more it seems to have in common with other documents.
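
The comparison behind these observations can be sketched as follows. This is a minimal illustration, not the code that produced the graphs: it assumes cosine similarity is computed on raw word-count vectors and uses the Automated Readability Index (ARI) as a stand-in readability metric, since the notes do not say which metrics or vectorization were used. The function names and the `min_cosine` / `min_spread` thresholds are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the raw word-count vectors of two texts."""
    a, b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document has no direction to compare
    return dot / (norm_a * norm_b)


def ari(text):
    """Automated Readability Index, with crude word and sentence splitting."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    chars = sum(len(w) for w in words)
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43


def compare(d1, d2):
    """Return the cosine similarity, D1 and D2 values, and their spread."""
    r1, r2 = ari(d1), ari(d2)
    return {
        "cosine": cosine_similarity(d1, d2),
        "d1": r1,
        "d2": r2,
        "spread": abs(r1 - r2),
    }


def high_similarity_outliers(pairs, min_cosine=0.8, min_spread=5.0):
    """Flag pairs with high cosine similarity but dissimilar D1 and D2 values.

    Both thresholds are illustrative assumptions, not values from the analysis.
    """
    flagged = []
    for d1, d2 in pairs:
        result = compare(d1, d2)
        if result["cosine"] >= min_cosine and result["spread"] >= min_spread:
            flagged.append(result)
    return flagged
```

Plotting `cosine` against `d1` and `d2` for many pairs is what would produce the cone shape described above, and `high_similarity_outliers` returns the pairs that fall outside the narrow end of the cone, which are the candidates for manual inspection.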