Qatar Financial Centre Regulatory Authority (QFCRA)
The QFC Regulatory Authority is the independent regulator of the Qatar Financial Centre (QFC). Its role is to authorize and regulate firms and individuals conducting financial services in or from the QFC. The Regulatory Authority works closely with a number of public entities and professional organizations to strengthen Qatar’s regulatory framework and build a legacy of regulation for the State.
The client aimed to build a personalized news feed for their customers that would meet specific requirements. Readers should get only relevant, real-time information from trusted news sources, so that they can draw well-informed conclusions and invest money wisely with minimal risk.
Before starting to work with us on a custom solution, the client also considered the following options:
Financial supervisors already used KNIME to extract and crawl data from a number of reports. The initial idea was to query the API with two- or three-word phrases to get precise results. However, about 50% of the returned articles were irrelevant because the keyphrase appeared in a different context, while the main goal was to provide data that was as accurate as possible.
Financial supervision is not about day trading, so articles about stock prices are too time-sensitive to be relevant for supervisors and must be removed from the feed.
With KNIME, financial supervisors can use crowdsourcing. For example, if all supervisors collectively label enough news articles, the result would be a good corporate view of what financial supervisors care about. And if there are enough collected labels, financial supervisors can start differentiating services to different teams or individuals depending on their specific interests.
What was done by the in-house team of QFCRA:
The input data is a collection of texts with labels that define the relevance of each text in the customer's opinion; these labels were provided by the customer. To process these texts, we used the spaCy extension for KNIME. spaCy is a well-known NLP framework for Python that can now be used in KNIME with no code at all.
spaCy includes multiple language models (23 languages) and standard utilities such as tokenization, lemmatization, morphological analysis, stop word filtering, NER, POS tagging, and text vectorization. Another benefit is that the spaCy extension for KNIME is fully compatible with the KNIME Text Processing nodes, which were also used in the project.
Once the texts are cleaned, they are ready to be analyzed with algorithms such as topic modeling, term co-occurrence, TF-IDF analysis, and text classification. The first three algorithms provide descriptive information about the collection of texts: they show the frequency and importance of the terms and can be used to build a simple graph based on term co-occurrence and a tag cloud.
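To make the descriptive step more concrete, here is a minimal Python sketch of TF-IDF weighting and term co-occurrence counting. It is only an illustration, not the KNIME workflow used in the project: the sample documents, the function names tf_idf and cooccurrence, and the plain tf * log(N / df) weighting are all assumptions chosen for brevity.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical, already-cleaned documents (lowercased, lemmatized, stop words removed).
docs = [
    ["bank", "regulator", "fine", "bank"],
    ["regulator", "license", "insurer"],
    ["bank", "regulator", "merger"],
]


def tf_idf(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    n_docs = len(documents)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def cooccurrence(documents):
    """Count how many documents each unordered pair of distinct terms shares."""
    pairs = Counter()
    for doc in documents:
        pairs.update(combinations(sorted(set(doc)), 2))
    return pairs


weights = tf_idf(docs)
pairs = cooccurrence(docs)
```

In this toy data, "regulator" appears in every document and therefore gets a weight of zero, while rarer terms score higher. The pair counts are the kind of input a co-occurrence graph is built from, and the weights could drive the term sizes in a tag cloud.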
Text classification, by contrast, addresses the problem of determining the relevance of the texts. For this purpose, two solutions were developed: one based on spaCy and one based on BERT. Since spaCy models can also vectorize texts, this feature was used to convert the texts to vectors, which were then used as input to the XGBoost tree algorithm. In that case, training took only a minute and the accuracy was 75% (with F1 scores of 78% and 71% for the two classes).
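The vectorize-then-classify idea can be illustrated with a small, dependency-free Python sketch. Everything in it is an assumption made for illustration: the three-dimensional word vectors are invented, the document vector is taken as the average of its word vectors (which is how spaCy models with static word vectors compute a document vector by default), and a nearest-centroid rule stands in for XGBoost so that the example runs without external libraries.

```python
# Hypothetical 3-dimensional word vectors; real models use hundreds of dimensions.
WORD_VECTORS = {
    "bank": [0.9, 0.1, 0.0],
    "regulator": [0.8, 0.2, 0.1],
    "fine": [0.7, 0.0, 0.2],
    "stock": [0.1, 0.9, 0.1],
    "price": [0.0, 0.8, 0.2],
    "rally": [0.1, 0.7, 0.3],
}
DIM = 3


def doc_vector(tokens):
    """Average the vectors of the known tokens; return zeros if none are known."""
    known = [WORD_VECTORS[t] for t in tokens if t in WORD_VECTORS]
    if not known:
        return [0.0] * DIM
    return [sum(column) / len(known) for column in zip(*known)]


def centroid(vectors):
    """Component-wise mean of a list of equally sized vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def predict(vector, centroids):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(vector, centroids[label])),
    )


# Hypothetical labeled training texts, already tokenized.
train = {
    "relevant": [["bank", "regulator"], ["regulator", "fine"]],
    "irrelevant": [["stock", "price"], ["price", "rally"]],
}
centroids = {
    label: centroid([doc_vector(doc) for doc in documents])
    for label, documents in train.items()
}

label = predict(doc_vector(["bank", "fine"]), centroids)
```

A gradient-boosted tree model such as XGBoost would take the same document vectors as input features but learn a far more flexible decision boundary than the centroid rule used here.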
The same task was then solved with BERT; training took about 30 minutes and required a GPU. As expected, this approach showed better classification performance, with an accuracy of 80% (and F1 scores of 80% and 77% for the two classes).
This way, the customer can choose between the two approaches by weighing the advantages and drawbacks of each.
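For readers less familiar with the reported metrics, the following Python sketch shows how accuracy and a per-class F1 score are computed from true and predicted labels. The labels below are invented and do not reproduce the figures reported for either model; the function name evaluate is likewise an assumption.

```python
def evaluate(y_true, y_pred):
    """Return (accuracy, {label: F1}) for the labels seen in either list."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    f1 = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)  # true positives
        fp = sum(t != label and p == label for t, p in pairs)  # false positives
        fn = sum(t == label and p != label for t, p in pairs)  # false negatives
        denominator = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall.
        f1[label] = 2 * tp / denominator if denominator else 0.0
    return accuracy, f1


# Hypothetical ground truth and predictions for eight articles.
y_true = ["relevant"] * 4 + ["irrelevant"] * 4
y_pred = ["relevant", "relevant", "relevant", "irrelevant",
          "irrelevant", "irrelevant", "relevant", "relevant"]

accuracy, f1 = evaluate(y_true, y_pred)
```

Reporting F1 for each class separately, as the case study does, shows whether a model is noticeably weaker on relevant or on irrelevant articles, which a single accuracy figure can hide.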
The Redfield team performed relevance estimation for incoming news and provided text analysis and dashboard visualization.
This allowed financial supervisors to generate a relevant news feed for the users.
Using different NLP approaches, we created a meaningful business dashboard that presents the main insights from a large collection of texts. We also delivered multiple solutions for classifying the relevance of texts for the users. These solutions are flexible in terms of inference and deployment.
KNIME Analytics Platform