Kreston Bulmar is the most prominent Bulgarian accounting and consulting agency. Over more than 25 years, the company has written and accumulated thousands of articles, analyses, commentaries, and videos, and has even published textbooks.
In 2019 they realized they lacked the means to share their vast knowledge effectively inside and outside the organization. Balans.bg was created to manage, organize, and monetize their expertise.
The goal was to create a go-to place for professionals where they could find the information needed to solve their specific business cases quickly.
Challenges & Solutions
There were four main branches of technical challenges:
⠀• data parsing and persistence
⠀• extraction of commonality between resources
⠀• creation of relations between resources
⠀• rendering data and doing it fast enough
All resources in balans.bg start as .docx files, which are converted into HTML via a rich text editor. The HTML is then processed in 3 steps.
The first step ensures that the input HTML is standardized and ready for use in all further algorithms. Its tasks include HTML sanitization, removal of extra characters, tags, and styles, and execution of table- and image-specific algorithms.
In the second step, the pre-defined structural entities of the resource's blueprint (each resource type has its own) are extracted, appropriately arranged, and assigned CSS classes. The HTML produced here is the version persisted in the database.
In the third step, before a resource is shown to users, it goes through a pipeline of algorithms that create various hints and modals and hide content from non-paying users. The result is the HTML rendered on the site.
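As a rough illustration of the standardization step, the sketch below strips disallowed tags, attributes, and inline styles while keeping the text. It is illustrative Python, not the platform's PHP code, and the allow-lists and the names Normalizer and normalize are assumptions made for this example only.

```python
from html import escape
from html.parser import HTMLParser
from typing import List

# Hypothetical allow-lists; the platform's real rules are not described here.
ALLOWED_TAGS = {"p", "h2", "h3", "ul", "ol", "li", "table", "tr", "td", "th",
                "img", "a", "strong", "em"}
ALLOWED_ATTRS = {"href", "src", "alt", "colspan", "rowspan"}
VOID_TAGS = {"img"}                       # tags that never get a closing tag
DROPPED_WITH_CONTENT = {"script", "style"}  # remove the tag and what it wraps


class Normalizer(HTMLParser):
    """Keeps allowed tags and attributes, drops everything else, keeps text."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.out: List[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in DROPPED_WITH_CONTENT:
            self._skip_depth += 1
            return
        if self._skip_depth or tag not in ALLOWED_TAGS:
            return
        kept = "".join(
            f' {name}="{escape(value or "", quote=True)}"'
            for name, value in attrs
            if name in ALLOWED_ATTRS
        )
        self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in DROPPED_WITH_CONTENT:
            self._skip_depth = max(0, self._skip_depth - 1)
            return
        if self._skip_depth:
            return
        if tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._skip_depth:
            self.out.append(escape(data, quote=False))


def normalize(html: str) -> str:
    """Return a standardized copy of the editor's HTML output."""
    parser = Normalizer()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

For example, normalize('<p style="color:red">Hi <span>there</span></p>') yields '<p>Hi there</p>': the inline style and the unknown span tag are dropped, the text survives.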
Everything is a resource
Almost everything in balans.bg is a "resource" - calculators, articles, legal acts, videos, news, and textbooks are all resources.
A resource can generate a unique identifier and URLs for users and admins, have authors and connections to other resources, be part of various statistics and search results, and much more.
It was challenging to create an all-encompassing abstraction, as resources are sometimes composed of other resources.
Every law is a resource and has versions - one for every time the lawmaker has updated it. Law versions are resources, too, and contain hundreds of articles, which are resources yet again.
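The nesting of laws, versions, and articles resembles a composite structure. The sketch below is one possible way to model it in illustrative Python (the platform itself uses PHP). The Resource class, its field names, and the URL formats are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Resource:
    """A generic resource that may contain other resources."""
    kind: str   # e.g. "law", "law_version", "law_article", "video"
    slug: str
    children: List["Resource"] = field(default_factory=list)

    @property
    def uid(self) -> str:
        # Hypothetical identifier format: "<kind>:<slug>".
        return f"{self.kind}:{self.slug}"

    def public_url(self) -> str:
        # Hypothetical URL scheme for regular users.
        return f"/{self.kind}/{self.slug}"

    def admin_url(self) -> str:
        # Hypothetical URL scheme for administrators.
        return f"/admin/{self.kind}/{self.slug}"

    def walk(self) -> Iterator["Resource"]:
        """Yield this resource and all nested resources, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()
```

With this shape, a law holding one version that holds two articles can be traversed with a single walk() call, and every node answers the same questions (identifier, URLs) regardless of its depth.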
The functionality that recommends relevant resources is enabled by manually defined relations between resources. Currently, you can find nearly 4,000 unique resources (articles of legal acts excluded), separated into 13 resource categories, which are further broken down into 68 resource types. You won't be able to see the complete list; however, different subsets will be presented to you as groups of relevant resources while you browse balans.bg.
Relations between resources
Each resource can have multiple relations and almost all do:
⠀• remuneration calculators are connected to the laws that define their arithmetic
⠀• videos are linked to the legal acts they attempt to simplify
⠀• authored texts relate to a legal act as a whole or a subset of its articles.
Relations consist of information about two resources and an optional date. They are usually defined by the author upon resource creation and are always peer-reviewed. A single relation may look insignificant; in the thousands, however, relations form a complex web that allows the platform to focus all its knowledge on a narrow use case and to keep its authored materials up-to-date.
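A relation as described here (two resources plus an optional date) could be sketched as follows. This is illustrative Python rather than the platform's code; Relation, related_index, and the identifier strings are names assumed for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, Optional, Set


@dataclass(frozen=True)
class Relation:
    """Links two resources by identifier, optionally tied to a date."""
    source: str
    target: str
    as_of: Optional[date] = None   # the optional date of the relation


def related_index(relations: Iterable[Relation]) -> Dict[str, Set[str]]:
    """Index relations in both directions for quick 'what is related?' lookups."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for relation in relations:
        index[relation.source].add(relation.target)
        index[relation.target].add(relation.source)
    return index
```

Indexing both directions means a law article can list the videos and calculators that point at it, not only the other way round.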
Keeping resources updated
Resources go out of date. Articles written months ago may no longer apply, because laws are constantly being changed. The possibility of users making decisions based on stale knowledge was a prime factor in developing relations between resources.
Resources either explain how laws operate or provide real-world examples for selected cases. Authors make sure their content stays accurate by providing a list of law articles that govern the topics discussed. The platform detects whenever any of these law articles is updated and alerts administrators and authors that the resource may have become outdated. This way Kreston Bulmar knows when an authored material is due for review and potentially a rewrite.
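The detection described above amounts to intersecting each resource's list of governing law articles with the set of articles that changed. A minimal sketch in illustrative Python, with a hypothetical function name and identifiers:

```python
from typing import Dict, Set


def resources_to_review(
    governed_by: Dict[str, Set[str]],
    updated_articles: Set[str],
) -> Dict[str, Set[str]]:
    """Map each affected resource to the updated law articles that govern it.

    governed_by: resource id -> ids of the law articles its author listed.
    updated_articles: ids of law articles changed by the latest amendment.
    """
    flagged: Dict[str, Set[str]] = {}
    for resource_id, articles in governed_by.items():
        changed = articles & updated_articles
        if changed:
            flagged[resource_id] = changed
    return flagged
```

The returned mapping could then feed a notification to the author and administrators, listing exactly which articles triggered the review.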
Provision of related content
Professionals often need to look at older versions of laws, because violations and legal issues are examined after the fact, when the applicable legislation may be months or years old.
Balans.bg allows the user to "time travel" easily, as all versions and document changes are kept. Everyone can quickly go back in time and read the materials that were current at the exact moment they are interested in. Not only that, but most relations are date-bounded, meaning that the related materials presented to the user will be from the correct time period.
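Picking the law version that applied on a given day can be done with a binary search over the versions' effective dates. The sketch below is illustrative Python; version_in_force and the (date, label) tuple layout are assumptions for the example, not the platform's schema.

```python
from bisect import bisect_right
from datetime import date
from typing import List, Optional, Tuple


def version_in_force(
    versions: List[Tuple[date, str]],
    on: date,
) -> Optional[str]:
    """Return the label of the version that applied on the given day.

    versions must be sorted by effective date, oldest first.
    Returns None when the day precedes the first version.
    """
    effective_dates = [effective for effective, _ in versions]
    position = bisect_right(effective_dates, on)
    if position == 0:
        return None
    return versions[position - 1][1]
```

The same comparison against a validity period could be applied to date-bounded relations, so that only materials from the matching period are shown.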
Complex problems are best solved by dividing them into smaller pieces. This concept is helpful as you can then find the answers scattered on the Internet as single units of information.
We like that approach and even take it one step further. You will indeed find the information in balans.bg, but you will find it connected. When the dots connect, you see the complete picture. With these kinds of relations, we don't simply have information but knowledge.
Laws are the backbone from which all relations stem, and their articles are the smallest, most basic units that resources connect to. Laws and other legal acts are then organized into topics that concentrate thousands of relations and provide a stable ontology for effective knowledge management.
Value-added tax is the richest topic - it contains around 4,500 relations between its resources. A single article from the law of the same name often has multiple videos, a couple of breakdown articles and real-life examples, a dozen legal cases, and 4-5 other types of resources.
See an example below.
Waiting is not something users tolerate, especially when they are searching for a solution to their problems. Slow response times push a site toward the back of Google's search results, and visitors who are kept waiting simply do not return. With this in mind, we put extra effort into speeding up the site. Part of the solution is good architecture, feature planning, and caching for quick access.
Balans.bg depends heavily on work being done before it is needed and on caching the result. This approach has consistently given us tenfold improvements in speed.
Legal acts, for example, can exceed 2-3 MB of data partitioned into a few hundred articles that live in the database. Fetching and arranging everything on the fly would lead to unacceptable load times of ~10s; getting the cached HTML, however, does the job in a second.
Another example is the homepage, which showcases statistics on the types and number of resources in the platform, with further breakdowns of how they are grouped into topics. The calculations take at least 5 seconds; the homepage, however, loads in less than 0.5 seconds thanks to caching.
Each caching instance provides a snapshot of how things were at the moment of execution, and naturally, the snapshot needs to be refreshed. The appropriate interval varies from one hour to one day, depending on the task performed.
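The idea of a snapshot with a refresh interval can be sketched as a small cache that recomputes a value only once it is older than a given age. This is illustrative Python with hypothetical names; note that it refreshes lazily on read, which is a simplification compared with refreshing in the background.

```python
import time
from typing import Any, Callable, Dict, Tuple


class SnapshotCache:
    """Stores computed values together with the time they were computed."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str, max_age_seconds: float,
            compute: Callable[[], Any]) -> Any:
        """Return the snapshot for key, recomputing it if it is too old."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < max_age_seconds:
            return entry[1]          # snapshot is still fresh
        value = compute()            # expensive work happens only here
        self._entries[key] = (now, value)
        return value
```

A statistics page might use max_age_seconds=3600 and a rarely changing legal act 86400, matching the one-hour to one-day range mentioned above.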
Tasks define the work that needs to be done and how often it needs to be done, and then schedule processes called jobs that do the actual work.
Each job is a short-lived process that performs CPU-intensive work. We cannot afford to do that work in real time, so we make the calculations in advance and cache the result.
We get a loop: the site needs new data, so it queues jobs to cache it, and then the cache provides the data dozens of times faster than it would typically take.
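The loop of tasks, jobs, and cache can be sketched roughly as below. This is illustrative Python, not the platform's Laravel code; Task, queue_due_jobs, and run_jobs are hypothetical names, and a plain dictionary stands in for the real cache.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Callable, Deque, Dict, Iterable


@dataclass
class Task:
    """Describes what to compute and how often the result must be refreshed."""
    name: str
    interval_seconds: float
    work: Callable[[], Any]
    last_queued_at: float = float("-inf")   # never queued yet


def queue_due_jobs(tasks: Iterable[Task], queue: Deque[Task], now: float) -> None:
    """Queue a job for every task whose refresh interval has elapsed."""
    for task in tasks:
        if now - task.last_queued_at >= task.interval_seconds:
            queue.append(task)
            task.last_queued_at = now


def run_jobs(queue: Deque[Task], cache: Dict[str, Any]) -> None:
    """Run queued jobs and store each result in the cache under the task name."""
    while queue:
        task = queue.popleft()
        cache[task.name] = task.work()
```

Page requests then read only from the cache, while queue_due_jobs is called periodically and run_jobs executes in a separate background process.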
The project requirements were brainstormed, defined, and refined during the first couple of weeks. Teams from both companies spent each day talking, evolving ideas, and putting healthy constraints on the project. The result was a 50-page specification that allowed us to move on to building the initial architecture.
The developers were co-writers of the specification. It took them days to design the database architecture and no more than six weeks to get the first database records created through the admin panel.
The initial DB design was extended multiple times; however, no data had to be migrated to new tables or re-created manually. This is a sign of a job well done, as most data migrations are extremely risky and costly.
Keep in mind that the insertion of all documents took 12 person-months!
Communication & Syncing
The fact that the clients were also product users helped a lot. After all, the platform is to share knowledge and improve the expertise of each employee in the organization. We routinely gathered feedback and made adjustments so the system was intuitive and user-friendly.
Managers used to meet frequently - at times every other day. At those meetings, we managed to prioritize tasks and recalibrate our focus by looking at the big picture. We had to make sure we were on track despite all changes that had to be made on the go.
Handling changing requirements
Working in sprints means you set a timeframe to deliver a list of features. It's similar to an agreement you and the client repeatedly make over the course of the project. Each such agreement gets you closer to the end goal. That's what we did on balans.bg.
Whenever a bug or a new high-priority functionality had to be implemented in the middle of a sprint cycle, we either postponed other features or lengthened the sprint. Following such a philosophy offers both flexibility and predictability. We needed both.
The only choice the clients made for us was to use PHP for the back end. We picked Laravel as the go-to PHP framework for web projects and utilized many of its tools.
We selected MariaDB for the primary database as there were plenty of relations between the models, Redis for caching and session storage, and Elasticsearch for logs, statistics, and, of course, searching.
Considering our user base, the design was kept simple and not so different from the other SaaS products they have been using for the last two decades. We used Bootstrap and jQuery as they offered complete SEO control, unlike the JS frameworks available at that time.
The tests were written with PHPUnit and Laravel Dusk.
Last but not least, the background processes are managed by Supervisord.
Balans.bg has been up and running for about a year now. The statistics show a 20% increase in visitors every quarter and 200 new registrations weekly. Even though we are still implementing features, the site is heavily used and ranking in the top 5 positions in Google for sought-after resources like remuneration calculators, accounting textbooks, and original articles. We believe the platform has matured enough, and we plan to initiate the first marketing campaign in a couple of weeks.
Developing balans.bg has been a challenge. On the one hand, we had to parse and analyze Bulgarian and European legislation; on the other hand, original articles, videos, positions, textbooks, handbooks, legal cases, etc. We needed proper abstractions to extract the commonalities between resources and define relations between them. There also had to be a powerful administration panel that manages all resources and makes sense of the data the platform collects. Finally, we had to consider load speeds and SEO and wrap everything in an easy-to-use product that generates revenue. And we managed to accomplish that as well.
Data shows that balans.bg will hit the breakeven point sometime next year, in 2023, and complete market utilization will probably take two more years. Kreston Bulmar and Lexis Solutions consider these results a considerable success, and we are about to continue working together on two new projects.
Looking back, we can tell we would not have made it without the constant communication, ongoing work, and trust both parties had in one another. I am sure these are the ingredients of any successful project, and you can be sure Lexis Solutions will always follow these principles.