balans.bg

balans.bg is a platform containing tens of thousands of pages of legal acts, articles, videos, and other resources which are all interconnected. It's a tailor-made Wikipedia for accountants, lawyers, auditors, and business owners.

  • Knowledge Management
2021

Overview

Client

Kreston Bulmar is the biggest Bulgarian accounting and consulting agency. Over their 25+ years of history, they have written and accumulated thousands of articles, analyses, commentaries, and videos, and have even published textbooks.

In 2019 they realized they lacked the means to share their vast knowledge effectively inside and outside the organization. balans.bg was created to manage, organize, and monetize their expertise.

Project

The goal was to create a go-to place where professionals can quickly find the information they need to solve their specific business case.

Challenges & Solutions

There were four main branches of technical challenges:
⠀• data parsing and persistence
⠀• extraction of commonality between resources
⠀• creation of relations between resources
⠀• rendering data and doing it fast enough

Document parsing

All resources in balans.bg start off as .docx files, which are converted into HTML via a rich text editor. The HTML is then processed in three steps.

Pre-processing

This step ensures that the input HTML is standardized and ready to use in all further algorithms. Some of the tasks it performs include HTML sanitization, removal of extra characters, tags, and styles, and execution of table-and-image-specific algorithms.
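A minimal sketch of what such a pre-processing pass could look like, written in illustrative Python rather than the platform's PHP. The tag whitelist, the attribute rules, and the names `Sanitizer` and `preprocess` are assumptions made for this example, and it is not a security-grade sanitizer.

```python
from html import escape
from html.parser import HTMLParser

# Assumed whitelist; the platform's real rules are not described in the text.
ALLOWED_TAGS = {"p", "h1", "h2", "h3", "ul", "ol", "li", "table", "tr", "td",
                "th", "strong", "em", "a", "img"}
ALLOWED_ATTRS = {"a": {"href"}, "img": {"src", "alt"}}
VOID_TAGS = {"img"}                      # tags that never get a closing tag
DROP_WITH_CONTENT = {"script", "style"}  # tags removed together with their text


class Sanitizer(HTMLParser):
    """Drops unknown tags and presentational attributes, keeps the text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self._skip = 0  # > 0 while inside a <script> or <style> element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self._skip += 1
            return
        if tag not in ALLOWED_TAGS:
            return  # the tag is dropped, its text content is still kept
        allowed = ALLOWED_ATTRS.get(tag, set())
        kept = [(k, v) for k, v in attrs if k in allowed and v is not None]
        attr_text = "".join(f' {k}="{escape(v, quote=True)}"' for k, v in kept)
        self.out.append(f"<{tag}{attr_text}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self._skip = max(0, self._skip - 1)
        elif tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self._skip:
            return
        # Replace non-breaking spaces, a common leftover of .docx exports.
        self.out.append(escape(data.replace("\u00a0", " "), quote=False))


def preprocess(html: str) -> str:
    """Return standardized HTML: whitelisted tags only, no inline styles."""
    parser = Sanitizer()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Feeding it `<p style="color:red"><span>Hi</span>&nbsp;<b>there</b></p>` yields `<p>Hi there</p>`: the style attribute and the unknown tags are gone, while the text survives.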

Resource parsing

Each resource type has its own blueprint and pre-defined structural entities which are extracted, arranged properly, and assigned CSS classes. The HTML produced is the one persisted in the database.
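To illustrate the idea of a blueprint, here is a small Python sketch that assumes a legal act arrives as a list of plain-text paragraphs. The blueprint entries, the patterns, and the CSS class names are hypothetical and only show how structural entities could be recognized and tagged.

```python
import re
from html import escape

# Hypothetical blueprint for a legal act: (entity name, pattern, CSS class).
LEGAL_ACT_BLUEPRINT = [
    ("chapter", re.compile(r"^Chapter\s+\w+"), "act-chapter"),
    ("article", re.compile(r"^Art\.\s*\d+"), "act-article"),
]


def parse_resource(paragraphs, blueprint, default_class="act-text"):
    """Tag each paragraph with the CSS class of the first entity it matches."""
    html = []
    for text in paragraphs:
        css = default_class
        for _name, pattern, css_class in blueprint:
            if pattern.match(text):
                css = css_class
                break
        html.append(f'<p class="{css}">{escape(text)}</p>')
    return "\n".join(html)
```

Other resource types would plug in their own blueprint while reusing the same parsing function.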

Post-processing

Before a resource is shown to users it goes through a pipeline of algorithms that are concerned with creating various hints, modals, and hiding content for non-paid users. This is the HTML rendered on the site.
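The pipeline idea can be sketched as a list of small functions applied in order. This is an illustrative Python sketch under stated assumptions: `Viewer`, `hide_paid_content`, `add_term_hints`, and the one-paragraph teaser rule are all invented for the example, and the regular expression only handles simple, non-nested paragraphs.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Viewer:
    is_paid: bool


PARAGRAPH = re.compile(r"<p\b.*?</p>", re.DOTALL)


def hide_paid_content(html, viewer, free_paragraphs=1):
    """Show non-paying viewers only a teaser followed by a paywall notice."""
    if viewer.is_paid:
        return html
    paragraphs = PARAGRAPH.findall(html)
    if len(paragraphs) <= free_paragraphs:
        return html
    teaser = "".join(paragraphs[:free_paragraphs])
    return teaser + '<div class="paywall">Subscribe to read the full text.</div>'


def add_term_hints(html, viewer):
    """Wrap a known abbreviation in a hint element (a single hard-coded term)."""
    return html.replace(
        "VAT", '<abbr class="hint" title="Value-added tax">VAT</abbr>'
    )


def post_process(html, viewer, steps):
    """Run the stored HTML through every step, in order."""
    for step in steps:
        html = step(html, viewer)
    return html
```

Because the steps run on every request, the stored HTML stays the same for all users and only the rendered output differs.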

Everything is a resource

Almost everything in balans.bg is a "resource" - calculators, articles, legal acts, videos, news, and textbooks are all resources.

A resource can generate a unique identifier, URLs for users and admins, have authors, connections to other resources, be part of various statistics, search results, and much more.

It was challenging to create an all-encompassing abstraction as sometimes resources are composed of other resources.

Example
Every law is a resource and has versions - one for every time the lawmaker has updated it. Law versions are resources too and contain hundreds of articles, which are resources yet again.
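The law, version, and article nesting can be sketched as a composite structure. The Python below is a simplified illustration, not the platform's model: the `Resource` class, its fields, and the URL scheme are assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Resource:
    kind: str   # e.g. "law", "law_version", "law_article"
    slug: str
    children: list["Resource"] = field(default_factory=list)
    parent: Optional["Resource"] = field(default=None, repr=False)

    def add(self, child: "Resource") -> "Resource":
        """Attach a child resource and return it for chaining."""
        child.parent = self
        self.children.append(child)
        return child

    @property
    def uid(self) -> str:
        """Identifier built from the chain of parents down to this resource."""
        prefix = f"{self.parent.uid}/" if self.parent else ""
        return f"{prefix}{self.kind}:{self.slug}"

    def public_url(self) -> str:
        return f"/r/{self.uid}"  # hypothetical URL scheme

    def walk(self):
        """Yield this resource and all nested resources, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


law = Resource("law", "vat-act")
version = law.add(Resource("law_version", "2021-01-01"))
article = version.add(Resource("law_article", "art-12"))
print(article.uid)  # law:vat-act/law_version:2021-01-01/law_article:art-12
```

Every level exposes the same interface, so search, statistics, and relations can treat a whole law and a single article alike.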

As of now, you can find nearly 4000 unique resources (articles of legal acts excluded), separated into 13 resource categories, which are further broken into 68 types of resources. You won't be able to see the complete list; instead, different subsets will be presented to you as groups of relevant resources while you browse balans.bg. The functionality that recommends resources as relevant is enabled via man-made relations between resources.

Relations between resources

Each resource can have multiple relations and almost all do:
⠀• remuneration calculators are connected to the laws that define their arithmetic;
⠀• videos are linked to the legal acts they attempt to simplify;
⠀• authored texts relate to a legal act as a whole or a subset of its articles...


Relations consist of information about two resources and an optional date. They are usually defined by the author upon resource creation and are always peer-reviewed. A single relation may look insignificant, but in the thousands they form a complex web that allows the platform to focus all its knowledge on a narrow use case and to keep its authored materials up-to-date.
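As a rough illustration of the data structure described above, a relation can be modeled as a pair of resource identifiers plus an optional date, with an index for lookups in either direction. The class names and the identifier format are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Relation:
    left: str                  # identifier of one resource
    right: str                 # identifier of the other resource
    on: Optional[date] = None  # the optional date mentioned in the text


class RelationIndex:
    """Looks up everything connected to a resource, in either direction."""

    def __init__(self, relations):
        self._by_resource = defaultdict(set)
        for rel in relations:
            self._by_resource[rel.left].add(rel)
            self._by_resource[rel.right].add(rel)

    def related(self, uid: str) -> set:
        found = set()
        for rel in self._by_resource.get(uid, ()):
            found.add(rel.right if rel.left == uid else rel.left)
        return found
```

Asking the index for a law article returns the videos, calculators, and authored texts attached to it, which is the lookup behind groups of relevant resources.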

Keeping resources updated

Resources go out of date. Articles written months ago may no longer apply, because laws are constantly being changed. The possibility of users making decisions based on stale knowledge was a prime factor in developing relations between resources.

Resources either explain how laws operate or provide real-world examples for selected cases. Authors make sure their content is always accurate by providing a list of law articles that govern the topics discussed. The platform detects whenever any of the law articles are updated and alerts administrators and authors about the resource potentially becoming outdated. This way Kreston Bulmar knows when an authored material is due for review and potentially a re-write.
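The detection step can be sketched as a comparison between the date a resource was last reviewed and the dates its governing law articles changed. This Python sketch assumes a `reviewed_on` field and a simple mapping of article identifiers to update dates; neither is confirmed by the original design.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class AuthoredResource:
    uid: str
    author: str
    governed_by: set[str]  # identifiers of the law articles the text relies on
    reviewed_on: date      # assumed field: when the author last confirmed it


def find_outdated(resources, article_updates: dict[str, date]):
    """Return (resource uid, author, changed articles) for resources to review."""
    alerts = []
    for res in resources:
        changed = sorted(
            article
            for article in res.governed_by
            if article in article_updates
            and article_updates[article] > res.reviewed_on
        )
        if changed:
            alerts.append((res.uid, res.author, changed))
    return alerts
```

Each returned tuple carries enough information to notify both the administrators and the author about which law articles triggered the review.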

Provision of relevant content

It is often the case that professionals look at older versions of laws. They need to because violations and legal cases are examined after the fact, when the applicable legislation is months or years old.

balans.bg allows the user to "time travel" easily as all the versions and all the changes of documents are kept. Everyone can go back in time and read the materials that were current at the exact moment they are interested in. On top of that, most relations are date-bound, which means that the related materials presented to the user will be from the correct time period.
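A minimal Python sketch of the "time travel" lookup, assuming each law version is identified by the date it came into force. How the platform interprets a relation's date is not stated, so the filter below makes an explicit assumption: a dated relation belongs to the law version that was in force on that date, and undated relations always apply.

```python
from bisect import bisect_right
from datetime import date


def version_in_force(effective_dates, on):
    """Return the effective date of the version in force on `on`, or None.

    `effective_dates` must be sorted in ascending order.
    """
    i = bisect_right(effective_dates, on)
    return effective_dates[i - 1] if i else None


def relations_for_moment(relations, effective_dates, on):
    """Keep the targets whose relation matches the version in force on `on`.

    `relations` is a list of (target uid, optional date) pairs.
    """
    current = version_in_force(effective_dates, on)
    return [
        target
        for target, rel_date in relations
        if rel_date is None
        or version_in_force(effective_dates, rel_date) == current
    ]


versions = [date(2019, 1, 1), date(2020, 7, 1), date(2022, 1, 1)]
print(version_in_force(versions, date(2021, 3, 15)))  # 2020-07-01
```

A binary search keeps the lookup fast even for laws with many versions.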

Knowledge

Complex problems are best solved by dividing them into smaller pieces. This concept is useful because you can then find the answers scattered across the Internet as single units of information.

We like that approach and even take it one step further. You will surely find the information in balans.bg, but you will find it connected. When the dots connect, you see the full picture. With these kinds of relations, we don't simply have information, we have knowledge.

Laws are the backbone from which all relations stem and their articles are the smallest, and primary, units that resources connect to. Laws, and other legal acts, are then organized into topics that concentrate thousands of relations and provide a stable ontology for effective knowledge management.

Example
The topic Value-added tax is the richest - it contains around 4,500 relations between its resources. It is often the case that a single article from the law of the same name has multiple videos, a couple of breakdown articles and real-life examples, a dozen legal cases, and 4-5 other types of resources attached to it.
See an example below.

balans.bg resources attached to law article

Caching

Waiting is not something the system tolerates, especially when users are searching for a solution to their problem. Slow response times drive visitors away, often never to return, and push the site toward the back of Google's search results. With this in mind, we put extra effort into speeding up the site. Part of the solution is good architecture, feature planning, and caching for quick access.

balans.bg is heavily dependent on work being done before it is needed and then caching the result. This approach has consistently provided us with 10x improvements in speed.


Legal acts, for example, can exceed 2-3 MB of data, partitioned into a few hundred articles that live in the database. Fetching and arranging everything on the fly would lead to unacceptable load times of ~10s, whereas fetching the cached HTML does the job in a second.

Another example would be the homepage, which showcases statistics regarding the types and number of resources in the platform, and further breakdowns for how they are grouped into topics. The calculations take at least 5 seconds, yet the homepage loads in less than 0.5 seconds thanks to caching.
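The pattern behind these numbers can be sketched as a cache lookup in front of the slow rendering work. The Python below uses an in-memory dictionary as a stand-in for a key-value cache; the key format, the one-hour lifetime, and the function names are assumptions for illustration.

```python
import time


class HtmlCache:
    """In-memory stand-in for a key-value cache with expiring entries."""

    def __init__(self, clock=time.monotonic):
        self._store = {}
        self._clock = clock  # injectable so expiry can be tested

    def put(self, key, html, ttl_seconds):
        self._store[key] = (html, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None or entry[1] <= self._clock():
            return None  # missing or expired
        return entry[0]


def render_legal_act(act_id, cache, build_html, ttl_seconds=3600):
    """Serve cached HTML when available, otherwise build it once and store it."""
    key = f"legal-act:{act_id}:html"
    html = cache.get(key)
    if html is None:
        html = build_html(act_id)  # the slow path: fetch and arrange articles
        cache.put(key, html, ttl_seconds)
    return html
```

Only the first request after expiry pays the full cost, and every later request is served straight from the cache.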

Jobs

Each caching instance provides a snapshot of how things were at the moment of execution, and, naturally, the snapshot needs to be refreshed. The appropriate interval varies from one hour to one day, depending on the task being performed.

Tasks define the work that needs to be done and how often, and then schedule processes called jobs that do the actual work.

Each job is a short-lived process that performs CPU-intensive tasks. It is something we cannot afford to do in real time, so we make the calculations in advance and cache the result.

Basically, we get a loop: the site needs fresh data, so it queues jobs to cache it, and then the cache provides the data dozens of times faster than it would normally take.
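The task and job loop can be sketched in a few lines. This is a single-process Python illustration with hypothetical names; on the platform, jobs run as separate background processes, which the sketch replaces with a plain function call.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Task:
    name: str
    interval_seconds: int
    work: Callable[[], object]       # the CPU-heavy calculation
    last_run: Optional[float] = None

    def is_due(self, now: float) -> bool:
        return self.last_run is None or now - self.last_run >= self.interval_seconds


def run_due_tasks(tasks, cache: dict, now: float) -> list:
    """Run a job for every due task and store its snapshot in the cache."""
    ran = []
    for task in tasks:
        if task.is_due(now):
            cache[task.name] = task.work()  # the "job": compute, then cache
            task.last_run = now
            ran.append(task.name)
    return ran
```

Calling `run_due_tasks` on a fixed tick refreshes each snapshot at its own interval, while the site itself only ever reads from the cache.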

Process

Specification

The project requirements were brainstormed, defined, and refined during the first couple of weeks. Teams from both companies spent each day talking, evolving ideas, and putting healthy constraints on the project, and we successfully created a 50-page specification that allowed us to continue with building the initial architecture.

Laying foundations

The developers were co-writers of the specification, so it took them only days to design the database architecture and no more than six weeks to get the first database records created through the admin panel.

The initial DB design was extended multiple times; however, no data ever had to be migrated to new tables or re-created manually. This is a sign of a job well done, as most data migrations are extremely risky and costly.

Keep in mind that the insertion of all documents took 12 man-months!

Communication & Syncing

The fact that the clients were also users of the product helped a lot. After all, the platform was created to share knowledge and improve the expertise of each employee in the organization. We would routinely gather feedback and make adjustments so the system was intuitive and user-friendly.

Managers met frequently, at times every other day. At those meetings, we managed to prioritize tasks and recalibrate our focus by looking at the big picture. We had to make sure we were on track despite all the changes that had to be made on the go.

Handling changing requirements

Working in sprints means you set a timeframe in which you deliver a list of features. It's similar to an agreement you and the client make repeatedly over the course of the project. Each such agreement gets you closer to the end goal. That's what we did in balans.bg.

Whenever a bug had to be fixed or a new high-priority feature had to be implemented in the middle of a sprint cycle, we either postponed other features or lengthened the sprint. Following such a philosophy offers both flexibility and predictability. We needed both.

Technologies

The only choice the clients made for us was to use PHP for the back-end. We picked Laravel as the go-to PHP framework for web projects and utilized a great deal of the tools it provided.

We selected MariaDB for the primary database as there were plenty of relations between the models, Redis for caching and session storage, and Elasticsearch for logs, statistics, and, of course, searching.

With our user base in mind, we kept the design simple and close to the other SaaS products they have been using for the last two decades. We used Bootstrap and jQuery as they offered full SEO control, unlike the JS frameworks available at that time.

The tests were written with PHPUnit and Laravel Dusk.

Last but not least, the background processes are managed by Supervisord.

Results

balans.bg has been up and running for about a year now. The statistics show a 20% increase in visitors every quarter and 200 new registrations weekly. Even though we are still implementing features, the site is heavily used and ranks in the top 5 positions in Google for sought-after resources like remuneration calculators, accounting textbooks, and original articles. We believe the platform has matured enough and we plan to initiate the first marketing campaign in a couple of weeks.

Conclusion

Developing balans.bg has been a challenge. On one hand, we had to parse and analyze Bulgarian and European legislation, and on the other hand, original articles, videos, positions, textbooks, handbooks, legal cases, etc. We needed proper abstractions to extract the commonalities between resources and define relations between them. There also had to be a powerful administration that manages all resources and makes sense of the data the platform collected. Finally, we had to consider load speeds and SEO and wrap everything in an easy-to-use product that generates revenue. And we managed to accomplish that as well.

Data shows that balans.bg will hit the breakeven point sometime next year - 2023, and the complete market utilization will probably take two more years. Both Kreston Bulmar and Lexis Solutions consider these results a huge success and we are about to continue working together on two new projects.

Looking back, I can tell we wouldn't have made it without the constant communication, persistent work, and the trust both parties had in one another. I am sure these are the ingredients of any successful project, and you can be sure Lexis Solutions will always follow these principles.

Originally published at .
Lexis Solutions is a software agency in Sofia, Bulgaria. We are a team of young professionals improving the digital world, one project at a time.
Contact
  • Deyan Denchev
  • CEO and Co-founder
© 2022 Lexis Solutions. All rights reserved.