Arda Çetinkaya

May 17, 2025

Making business solutions more effective with AI: RAG and Fine-Tuning

AI, Fine-tuning, LLM, RAG

~ 4 minute read.

As you have probably noticed over the last year or two, or experienced first-hand, we are in a period where AI skills have become critical for working effectively and efficiently. Businesses and institutions are trying to benefit from large language models (LLMs). Beyond gaining productivity from off-the-shelf LLM services, businesses are now looking for ways to integrate their own business value into these models. In this post I will talk about strategies for producing business value with LLMs. Two topics are hot and important right now: Retrieval-Augmented Generation (RAG) and fine-tuning. Both are strategies, within generative AI, for integrating business-specific data with language models. With these two approaches, businesses can deliver their business value through AI more efficiently. Each approach has advantages and disadvantages, of course, and which one to prefer depends on the needs.

I first wrote this post in English on April 27, 2025, and that version is available as well. I haven't found time to write on my own site for quite a while. I may also move copies of the posts I write on other platforms here, as a contribution to my own archive…😁🫣

Retrieval-Augmented Generation (RAG)

RAG is a method in which an AI model pulls the additional information it needs from a large data store before answering a question. The term is easier to explain than to translate directly into Turkish. Put more simply, the AI behaves as if it were saying, “Hold on, let me Google that first…” Here, “Googling” can be read as running a quick search over a data store of predefined data chunks.

In this context, the data store holds organization-specific data, that is, the information that represents the business's value. As the name suggests, the RAG strategy has two steps: “Retrieval” and “Augmented”. First, data is fetched from the data store by a search algorithm. Then the retrieved results are combined with the question and given to the LLM, which produces a more effective answer. The retrieval step is therefore very important for feeding the LLM good data. To retrieve data accurately and relevantly, the data has to be prepared properly, and this is where vector databases come in.

Vector databases store data as arrays of numbers (vectors). These vectors can then be queried by similarity, so similar data can be found much more efficiently.

To capture similarity in the data, models are used to derive it from the data itself. As business data is loaded into the vector database, it is converted into embeddings. For example, words with related meanings (such as “cat” and “dog”) get vectors that are close to each other, while unrelated words (such as “cat” and “car”) end up far apart.
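To make “close” and “far apart” concrete, here is a minimal sketch that compares three hand-made vectors using cosine similarity, one common similarity measure. The numbers are invented for illustration and the helper name CosineSimilarity is my own; a real embedding model produces vectors with hundreds or thousands of dimensions.

```csharp
using System;

public static class EmbeddingDemo
{
    // Cosine similarity: close to 1.0 means the vectors point the same way,
    // close to 0.0 means they have little in common.
    public static double CosineSimilarity(double[] a, double[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    public static void Main()
    {
        // Invented toy vectors, not the output of a real embedding model.
        double[] cat = { 0.9, 0.8, 0.1 };
        double[] dog = { 0.8, 0.9, 0.2 };
        double[] car = { 0.1, 0.2, 0.9 };

        Console.WriteLine($"cat vs dog: {CosineSimilarity(cat, dog):F2}");
        Console.WriteLine($"cat vs car: {CosineSimilarity(cat, car):F2}");
    }
}
```

With these toy values, the cat/dog score comes out much higher than the cat/car score, which is the property a vector database relies on when it searches by similarity.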

Specialized models are used for the embedding step, so that data is embedded into the vector database in a more meaningful and related way. Later, when a question arrives, it is also converted into an embedding, the most relevant data chunks are fetched from the database, and they are combined with the LLM to deliver more reliable answers.

To make the RAG strategy more concrete, let's walk through an example process:

Step 1: Collect all business data
- PDFs, Word files, web pages, data records, etc. These can be documents, user stories, support tickets or wiki content.
Step 2: Prepare the data
- Clean the data, remove duplicates, and split large documents into small chunks (this makes it easier for the AI to process the data and create embeddings)
Step 3: Add tags (optional)
- Adding one or more tags to the data chunks makes the system more efficient. After all, we are doing all this to make our work more effective, right? 😀
Step 4: Convert the data into embeddings
- Use an embedding model to turn the data into vectors that reflect its similarities.
Step 5: Store the vectors in a vector database
- Store the data and the vectors in a database such as Couchbase, MongoDB, Azure Cosmos DB or PostgreSQL.
Step 6: Augment the answer based on the question
- The question is converted into an embedding with a model
- The vector database is searched for similar embeddings; the choice of database matters at this point
- Similar data chunks are retrieved
- The retrieved data is combined with the LLM to produce a meaningful answer
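The steps above can be sketched in a few lines. This is only an illustration under loud assumptions: ToyEmbedder is a hypothetical stand-in that counts words instead of calling a real embedding model, the "vector database" is an in-memory list, the sample chunks are invented, and the final LLM call is left out, so the sketch stops at building the prompt.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a real embedding model: counts words from a fixed
// vocabulary. It captures word overlap only, not meaning.
public sealed class ToyEmbedder
{
    private readonly Dictionary<string, int> _vocabulary = new();

    public ToyEmbedder(IEnumerable<string> corpus)
    {
        foreach (var text in corpus)
            foreach (var word in Tokenize(text))
                _vocabulary.TryAdd(word, _vocabulary.Count);
    }

    public double[] Embed(string text)
    {
        var vector = new double[_vocabulary.Count];
        foreach (var word in Tokenize(text))
            if (_vocabulary.TryGetValue(word, out int index)) vector[index] += 1.0;
        return vector;
    }

    private static string[] Tokenize(string text) =>
        text.ToLowerInvariant()
            .Split(new[] { ' ', ',', '.', '?', '!' }, StringSplitOptions.RemoveEmptyEntries);
}

public static class RagSketch
{
    public static double Cosine(double[] a, double[] b)
    {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.Length; i++) { dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i]; }
        return na == 0 || nb == 0 ? 0 : dot / (Math.Sqrt(na) * Math.Sqrt(nb));
    }

    // Step 6: embed the question and return the most similar chunks.
    public static List<string> Retrieve(
        ToyEmbedder embedder, List<(string Chunk, double[] Vector)> index, string question, int topK)
    {
        var q = embedder.Embed(question);
        return index.OrderByDescending(e => Cosine(q, e.Vector))
                    .Take(topK).Select(e => e.Chunk).ToList();
    }

    // The retrieved chunks go into the prompt that would be sent to the LLM.
    public static string BuildPrompt(string question, IEnumerable<string> context) =>
        "Answer using only this context:\n- " + string.Join("\n- ", context)
        + "\n\nQuestion: " + question;

    public static void Main()
    {
        // Steps 1-2: already-cleaned, already-chunked sample data (invented).
        var chunks = new[]
        {
            "Refunds are processed within 14 days of the return request.",
            "Support is available on weekdays between 09:00 and 17:00.",
            "Premium members get free shipping on every order."
        };

        // Steps 4-5: embed every chunk and keep (chunk, vector) pairs in memory.
        var embedder = new ToyEmbedder(chunks);
        var index = chunks.Select(c => (Chunk: c, Vector: embedder.Embed(c))).ToList();

        var question = "How long do refunds take?";
        Console.WriteLine(BuildPrompt(question, Retrieve(embedder, index, question, topK: 1)));
    }
}
```

In a real setup the embedder would be a hosted or local embedding model and the list would be one of the databases from Step 5; the shape of the flow stays the same.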

I would describe the RAG strategy as an add-on to an existing solution. Its plus side is that it needs no extra LLM training or heavy compute. The hard part, I think, can be data preparation. It is not difficult as such, but it can take some effort if the data is unstructured. For the LLM's output to be effective, it matters a lot that the data can be interpreted cleanly.

Once the data is in order, RAG also makes updates and changes easy: as the data changes, only the data store needs to be updated, which keeps the results current.

Fine-tuning (retraining a model with small adjustments)

Fine-tuning is a way of embedding business data into a model. A pre-trained model is taken and trained further on a specific business data set. Put simply, it is like having the AI “take lessons” before it answers. In other words, fine-tuning means teaching the business's knowledge to the AI…

I don't think this approach is as simple as RAG. It requires deeper knowledge of the training process. Also, to get effective results, the pre-trained model you start from needs to be of good quality and relevant to the domain. As you can guess, that is not always easy or sensible. For example, if you work in health insurance, a health-focused base model will be more useful. Some existing models (such as Llama or Phi) can be used, but in that case, if your business data is not good enough, the results may not be that effective either, because most of these models are trained on limited, general-purpose data.

Suppose you do have a good base model; you still need solid knowledge of the training process. Training with the right parameters and running tests takes time. The training process can take weeks. If the data changes frequently, this approach may struggle to deliver up-to-date results. These processes also demand serious compute, which means cost.

I don't want to make fine-tuning look like a bad approach. But in my view, with today's LLMs, it should not be the first strategy a business reaches for. Of course, if the business is mature in data engineering and machine learning, it can be more effective by building its own model and retraining it on specific data.

In the future, I believe large global organizations such as the World Health Organization (WHO), UNICEF, Greenpeace, the EU and UNESCO may publish domain-specific LLMs, just as they publish reports… That could make fine-tuning easier for businesses.

Conclusion

We could say that RAG is the AI doing a Google search first, while fine-tuning is like tuning a guitar's strings before playing.

Both strategies have advantages and disadvantages, and these depend entirely on the needs and on the maturity of the business. If you ask me, for now I vote for RAG, because it is effective and efficient in terms of business value.

I have summarized my thoughts in the table below:

Strategy	RAG	Fine-tuning
Core concept	AI pulls business data from outside	AI is retrained on business data
Better suited to	Frequently changing data	Static, specialist data
Cost	Low	High
Needs LLM expertise?	No	Yes

If you have opinions, experiences or questions, you can leave a comment below. See you in another post…👋🏼


April 13, 2024

Some upcoming trends for the software development world…

AI, software development

~ 3 minute read.

Every year one of my friends (@Muhammed Hilmi Koca) in the developer community in Turkey curates insights about software development trends. Lots of skilled and experienced colleagues share their ideas and thoughts. And thanks to my friend, this year I also got a spot to share some of my ideas.

Even though the original full text is in Turkish, I really suggest you try to check it out. There are really good insights.

I also wanted to share my part, translated, on my blog. So here are some thoughts about upcoming software development trends…

I believe that 2024 will be a year where artificial intelligence is scrutinized in the software world, and it will begin to be utilized more efficiently. Over the past 3-4 years, we have endeavored to understand the advancements in artificial intelligence methods and tools with a bit of amusement. We laughed and enjoyed ourselves by creating visuals and texts with “Generative AI” solutions… We also experienced that artificial intelligence tools are able to generate code at a very proficient level. With 2024, I anticipate that artificial intelligence tools will become a more significant player in software processes.

I anticipate that processes will strive to become more efficient by starting to prefer or incorporate artificial intelligence in “code review” processes or “re-factoring” requirements. There are already companies attempting to integrate tools like GitHub Copilot into their code integration processes…

As these steps begin to materialize, in the medium term, companies may start integrating language models tailored to their own code inventories into their code development processes. I believe that resolving a business requirement through artificial intelligence will enable companies or organizations to address their needs while maintaining their own standards.

Programming Languages

I believe there is now a more informed approach to programming languages or platforms, with an awareness of their advantages and use cases. We no longer defend a language or platform to the death as we used to. Right? Or do we still defend them? 🤦🏻‍♂️Oh, I hope not… I also believe that programming languages have reached a certain level of maturity. Innovations are now introduced regularly according to real needs. Consequently, I believe that the most suitable solution can now be chosen according to the specific requirements. However, I sense that the Rust language, which has gained more attention in the last 2-3 years, will stand out more this year in terms of needs, performance, and resource usage. (Note to self: Finish reading the O’Reilly Programming Rust book already)

No Code

The inclusion of artificial intelligence in the game will draw a little more attention to “no code” tools. Although as programmers we may still not find them very reliable, the involvement of individuals who do not know programming in the game with “no code” tools will not only begin this year but also lead to other developments in the medium term.

Cloud

“Cloud platforms” have now become the default. The ease of access to platforms and the maturity of services will maintain their position in the software world. However, I think the “cloud exit strategy”, which is always on the agenda because of costs or global conditions, may gain a little more importance. Because of business sustainability requirements and regulations, having (or being able to have) alternatives will also be an important topic. “Cloud-native” solutions that remove platform dependency will still be a hot topic in 2024, and should be…

We've been talking about IoT for a long time, but honestly, nothing has turned out the way I imagined. The Covid period affected the rollout of 5G to some extent and consequently slowed down the interaction opportunities between devices, I think. However, the increasing popularity of VR and AR glasses (again) will underscore the interaction between the virtual world and the real world, and so IoT solutions will start to become more visible.

Prompt Engineering

I started this post with artificial intelligence, and I'm ending with it. With 2024, I believe that “prompt engineering” competency will begin to gain importance for programmers. Artificial intelligence will not replace the jobs of programmers, but I think programmers who can interact more easily and consciously with artificial intelligence will be one step ahead. I anticipate that this awareness will gradually be sought in job advertisements, because the questions asked to AI tools and the expressions shared with them affect the quality of the results. Perhaps it's a bit utopian, but I believe we'll see it happen… In 2-3 years, being skilled at prompt engineering for AI tools will be as important as knowing a software development principle.

Until we meet again, happy coding…


February 3, 2024

Keyed Dependency Injection in .NET 8

.NET, ASP.NET Core, C#

~ 3 minute read.

.NET 8 brings some great improvements, making it a key milestone for the “new .NET” with cool new features and better performance. In this post, I want to share a feature I really like, minor but handy: “Keyed Dependency Injection (DI)”.

Keyed DI lets you register services under user-defined keys and consume those services by those keys. Yes, I can hear you say, “But I can already do this”. It is already possible with Autofac, Unity, etc., which is why I call this new feature minor. But I think it is great to have it in the .NET platform as a built-in feature.

Before delving into the specifics of Keyed DI, let’s consider its purpose, especially for those encountering it for the first time. Picture a scenario where different service implementations or configurations are necessary due to specific business requirements, such as distinct cache provider implementations.

builder.Services.AddSingleton<ICacheProvider>(provider => new RedisCacheProvider("SomeFancyServer:1234"));
builder.Services.AddSingleton<ICacheProvider>(new UltimateCacheProvider("MoreFancyServer:5432"));

With the built-in container before .NET 8, picking one specific registration out of several like these posed challenges. How could one seamlessly inject a particular cache provider into a specific API?
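A small console sketch shows the challenge, assuming the Microsoft.Extensions.DependencyInjection package is referenced. The two provider classes are trimmed, hypothetical stand-ins for the ones registered above (no constructor arguments, just a Name).

```csharp
using System;
using System.Linq;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical, simplified stand-ins for the cache providers in the post.
public interface ICacheProvider { string Name { get; } }
public class RedisCacheProvider : ICacheProvider { public string Name => "redis"; }
public class UltimateCacheProvider : ICacheProvider { public string Name => "ultimate"; }

public static class UnkeyedDemo
{
    public static void Main()
    {
        var services = new ServiceCollection();
        services.AddSingleton<ICacheProvider>(new RedisCacheProvider());
        services.AddSingleton<ICacheProvider>(new UltimateCacheProvider());

        using var provider = services.BuildServiceProvider();

        // A single resolve returns the most recent registration.
        Console.WriteLine(provider.GetRequiredService<ICacheProvider>().Name); // ultimate

        // All registrations are still reachable, but only as an unlabeled sequence
        // that the caller has to filter by hand.
        var all = provider.GetServices<ICacheProvider>().Select(p => p.Name);
        Console.WriteLine(string.Join(", ", all)); // redis, ultimate
    }
}
```

Without keys, a consumer that wants the Redis provider has to resolve the whole sequence and pick by type or by some property, which is the kind of workaround keyed registrations remove.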

Keyed Dependency Injection (DI)

With the new keyed DI services feature in .NET 8, we can now register services under keys and then use those keys when consuming the services. Let's look at this with a simple example.

Suppose some fancy business requirement calls for two different implementations of one interface, as below.

Please check the GitHub link at the end of the post for full example codes.

public interface IProductService
{
    List<string> ListProducts(int top = 5);
}

public class AmazonProducts : IProductService
{
    public List<string> ListProducts(int top = 5)
    {
        return new List<string>{
            "Fancy Product from Amazon",
            "Another Fancy Product from Amazon",
            "Top Seller Product from Amazon",
            "Most Expensive Product from Amazon",
            "Cheapest Product from Amazon",
            "Some Shinny Product from Amazon",
            "A Red Product from Amazon",
            "A Blue Product from Amazon",
            "Most Tasty Cake from Amazon",
            "Most Biggest Product from Amazon",
       }.Take(top).ToList();
    }
}

public class CDONProducts : IProductService
{
    public List<string> ListProducts(int top = 5)
    {
        var ran = new Random();
        return new List<string>{
            "Fancy Product from CDON",
            "Another Fancy Product from CDON",
            "Top Seller Product from CDON",
            "Most Expensive Product from CDON",
            "Cheapest Product from CDON",
            "Some Shinny Product from CDON",
            "A Red Product from CDON",
            "A Blue Product from CDON",
            "Most Tasty Cake from CDON",
            "Most Biggest Product from CDON",
       }.OrderBy(x => ran.Next()).Take(top).ToList();
    }
}

And let’s have some web API which is exposing the following service.

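public class ProductsAPI
{
    private readonly IProductService _service;

    // The container supplies whichever IProductService is registered.
    public ProductsAPI(IProductService service)
    {
        _service = service;
    }

    // Instance method (not static) so it can read _service; nothing is awaited, so no async.
    public ActionResult<List<string>> GetProducts()
    {
        return _service.ListProducts();
    }
}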

So far, this should all be very familiar to you. No rocket science. So, let's register those two service implementations with some keys.

builder.Services.AddKeyedScoped<IProductService,AmazonProducts>("amazon");
builder.Services.AddKeyedScoped<IProductService,CDONProducts>("cdon");

// There are also .AddKeyedSingleton() and .AddKeyedTransient() methods, as usual

With the new registration methods, we can register services under keys. Here we register the AmazonProducts implementation with the “amazon” key and the CDONProducts implementation with the “cdon” key, so that each implementation can be requested by its key.

And now these services can be consumed with keys. .NET 8 adds a new attribute, FromKeyedServices(key). When a service is injected with this attribute, the key used at registration selects the implementation, so the service registered under that key is injected into the consuming class.

public ProductsAPI([FromKeyedServices("amazon")]IProductService service){
    _service = service;
}

// OR the attribute can be applied to a method parameter

public static async Task<ActionResult<List<string>>> GetProducts([FromKeyedServices("amazon")]IProductService service)
{
    return service.ListProducts();
}

No more tricky workarounds to incorporate different service implementations. For instance, if a new requirement emerges, like exposing CDON products, a simple key change in the web API is all that’s needed.
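The same idea can be tried outside a web API. The sketch below assumes the Microsoft.Extensions.DependencyInjection package (version 8 or later) and uses trimmed stand-ins for the two product services; the Storefront class is hypothetical and exists only to show constructor injection by key.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.DependencyInjection;

// Trimmed stand-ins for the services in the post (one product each).
public interface IProductService { List<string> ListProducts(int top = 5); }
public class AmazonProducts : IProductService
{
    public List<string> ListProducts(int top = 5) => new() { "Fancy Product from Amazon" };
}
public class CDONProducts : IProductService
{
    public List<string> ListProducts(int top = 5) => new() { "Fancy Product from CDON" };
}

// Hypothetical consumer: the key on the parameter selects the implementation.
public class Storefront
{
    private readonly IProductService _service;
    public Storefront([FromKeyedServices("cdon")] IProductService service) => _service = service;
    public string FirstProduct() => _service.ListProducts()[0];
}

public static class KeyedDemo
{
    public static void Main()
    {
        var services = new ServiceCollection();
        services.AddKeyedScoped<IProductService, AmazonProducts>("amazon");
        services.AddKeyedScoped<IProductService, CDONProducts>("cdon");
        services.AddScoped<Storefront>();

        using var provider = services.BuildServiceProvider();
        using var scope = provider.CreateScope();

        // Resolve by key directly from the provider...
        var amazon = scope.ServiceProvider.GetRequiredKeyedService<IProductService>("amazon");
        Console.WriteLine(amazon.ListProducts()[0]);

        // ...or let the container honor [FromKeyedServices] on a constructor parameter.
        Console.WriteLine(scope.ServiceProvider.GetRequiredService<Storefront>().FirstProduct());
    }
}
```

Switching Storefront from CDON to Amazon is then a one-word change of the key, with no change to the registrations.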

This was a very simple example, but I hope it helped to get the idea with the keyed DI services in .NET 8.

Bonus!!!

While the above example is straightforward, it effectively conveys the power of keyed DI services. A similar approach can be applied to configuration APIs in .NET, although not a new feature like keyed DI services, showcasing its versatility in managing configuration bindings.

Consider a scenario where there’s a uniform configuration structure per service/component/module as below.

{
  "ProductService": {
    "Amazon": {
      "top":3
    },
    "CDON": {
      "top":10
    }
  }
}

These configurations can be bound with keys/names, allowing for seamless binding of required configuration values into a service.

builder.Services.Configure<ListOptions>("amazon", builder.Configuration.GetSection("ProductService:Amazon")); 
builder.Services.Configure<ListOptions>("cdon", builder.Configuration.GetSection("ProductService:CDON"));

So the “amazon” name carries one set of option values and the “cdon” name carries another.

And with IOptionsSnapshot<T>.Get(name) it is possible to read a named option, as below.

It is important to use IOptionsSnapshot for named configurations. IOptions<T> does not support them, since it only exposes the default instance through Value. Maybe I can write another post about IOptions, IOptionsSnapshot and IOptionsMonitor.

public static async Task<ActionResult<List<string>>> GetProducts([FromKeyedServices("amazon")]IProductService service, IOptionsSnapshot<ListOptions> config)
{
    var listOptions = config.Get("amazon");
    return service.ListProducts(listOptions.ListCount);
}

public static async Task<ActionResult<List<string>>> GetAlternativeProducts([FromKeyedServices("cdon")]IProductService service,IOptionsSnapshot<ListOptions> config)
{
    var listOptions = config.Get("cdon");
    return service.ListProducts(listOptions.ListCount);
}
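The post does not show the ListOptions class, so here is a self-contained sketch of named options with a hypothetical one. It assumes the Microsoft.Extensions.Configuration, Microsoft.Extensions.Options.ConfigurationExtensions and Microsoft.Extensions.DependencyInjection packages, and uses an in-memory collection in place of the JSON file. One thing worth noting: the configuration binder matches keys to property names (case-insensitively), so a "top" key populates a property named Top; a property called ListCount would need a matching "ListCount" key.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Options;

// Hypothetical options class; the property name matches the "top" key.
public class ListOptions
{
    public int Top { get; set; } = 5;
}

public static class NamedOptionsDemo
{
    public static void Main()
    {
        // In-memory stand-in for the JSON configuration shown earlier.
        var configuration = new ConfigurationBuilder()
            .AddInMemoryCollection(new Dictionary<string, string?>
            {
                ["ProductService:Amazon:top"] = "3",
                ["ProductService:CDON:top"] = "10"
            })
            .Build();

        var services = new ServiceCollection();
        services.Configure<ListOptions>("amazon", configuration.GetSection("ProductService:Amazon"));
        services.Configure<ListOptions>("cdon", configuration.GetSection("ProductService:CDON"));

        using var provider = services.BuildServiceProvider();
        using var scope = provider.CreateScope(); // IOptionsSnapshot<T> is a scoped service

        var snapshot = scope.ServiceProvider.GetRequiredService<IOptionsSnapshot<ListOptions>>();
        Console.WriteLine(snapshot.Get("amazon").Top); // 3
        Console.WriteLine(snapshot.Get("cdon").Top);   // 10
        Console.WriteLine(snapshot.Value.Top);         // 5: the unnamed default was never configured
    }
}
```

The last line is the reason plain IOptions<T> is not enough here: Value always returns the default, unnamed instance.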

This was a short and quick post, but I hope it raises some awareness of a new feature in .NET 8. Happy coding, and see you in the next article.

Please check the following GitHub link for the full code and implementations.
https://github.com/ardacetinkaya/Demo.KeyedService


March 26, 2023

Protecting data with ITimeLimitedDataProtector interface in ASP.NET Core

.NET, ASP.NET Core, data protection

~ 5 minute read.

I think it is quite important to be proficient in the APIs of the “framework” or “library” we work with, in addition to the language we use when developing applications. This lets us meet certain requirements easily and use the “framework” more effectively. With this in mind, I will talk about the ITimeLimitedDataProtector API in ASP.NET Core, which can be useful for creating secure or time-limited data models that may be needed in different scenarios.

Temporary data models or “text” expressions that are valid for only a certain period can sometimes be an approach that we need. Links sent for “Email Confirmation” or for resetting passwords during membership transactions may be familiar to many people. Or values such as codes that will be valid for a certain period in “soft-OTP” (One-time password) scenarios or “Bearer” tokens…etc.

Obviously, different methods and approaches are possible for such requirements. Without going into too much detail, I will try to briefly discuss how we can meet such needs in the .NET platform.

As you know, .NET and especially ASP.NET Core guide us with many APIs to meet the security needs of today's applications. Strong encryption APIs, HTTPS concepts, CORS mechanisms, CSRF prevention, data protection, authentication, authorization, secret usage, and so on…

ITimeLimitedDataProtector

For the requirement I mentioned above, let’s look at the ITimeLimitedDataProtector interface in .NET under the “data protection” namespace. We can have some implementations for data or expressions that will only be valid for a certain period of time with the methods provided by this interface.

To use the methods of this interface, we first need the “Microsoft.AspNetCore.DataProtection.Extensions” package. Generally, this package is a library that exposes “data protection” features in .NET.

To use the “ITimeLimitedDataProtector” interface, we first need to create a “DataProtectionProvider”, and then define a protector that will protect our data with this “provider”.

var timeLimitedDataProtector = DataProtectionProvider.Create("SomeApplication")
    .CreateProtector("SomeApplication.TimeLimitedData")
    .ToTimeLimitedDataProtector();

When you look at the parameters of the methods here, the “string” expressions you see are important; they can be thought of as a kind of labeling for the created provider and DataProtectors. According to this labeling, the purpose and scope of data security are specified. These expressions are used in the creation of the keys that will be used to protect the data. Thus, a provider created with DataProtectionProvider.Create(“abc”) cannot access the expressions that ensure the security of a provider created in the form of DataProtectionProvider.Create(“xyz”).

When you look at the parameters of the DataProtectionProvider.Create() method, you can see that you can set some properties for protecting the data. You can specify a directory where the keys for data protection will be stored or that the keys will be encrypted with an additional certificate using X509Certificate2. I won’t go into too much detail about these, but what I want to emphasize here is that it is possible to customize data protection methods and change protection approaches with parameters.
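As a small illustration of both points, the sketch below creates a provider that stores its keys in a specific directory and then shows that a protector created for one purpose cannot read data protected for another. It assumes the Microsoft.AspNetCore.DataProtection.Extensions package; the directory name and the two purpose strings are made up for the demo, and it isolates by purpose string rather than by application name.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using Microsoft.AspNetCore.DataProtection;

public static class PurposeIsolationDemo
{
    public static void Main()
    {
        // Assumed location: a temp folder used only for this demo's keys.
        var keyDirectory = new DirectoryInfo(Path.Combine(Path.GetTempPath(), "demo-dp-keys"));
        var provider = DataProtectionProvider.Create(keyDirectory);

        // Two protectors from the same provider, labeled with different purposes.
        var invoices = provider.CreateProtector("Demo.Invoices");
        var resets = provider.CreateProtector("Demo.PasswordResets");

        string protectedText = invoices.Protect("HelloWorld");

        // The protector with the matching purpose can read it back.
        Console.WriteLine(invoices.Unprotect(protectedText));

        // A protector with a different purpose cannot.
        try
        {
            resets.Unprotect(protectedText);
        }
        catch (CryptographicException)
        {
            Console.WriteLine("A different purpose string cannot read this payload.");
        }
    }
}
```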

We then use the timeLimitedDataProtector variable we created to protect a value, passing the validity period to the Protect() method.

ProtectedData = timeLimitedDataProtector.Protect(plaintext: "HelloWorld"
                    , lifetime: TimeSpan.FromSeconds(LifeTime));

With the above expression, we are encrypting, hashing, and protecting the phrase “HelloWorld”. With LifeTime set to 20, our ProtectedData property becomes a structure similar to the following, which is valid for 20 seconds.

[Image: Time limit]

We can specify any time duration in the form of TimeSpan with the lifetime parameter of the Protect() method, of course.

After protecting the expression, we can get “HelloWorld” back by opening it with the Unprotect() method within 20 seconds, as in this example. After 20 seconds the data we protected loses its validity, and Unprotect() throws a CryptographicException instead of returning the value.

string data = timeLimitedDataProtector.Unprotect(protectedData);

It is not recommended to use data protection for a long or indefinite period of time with this API. The reason is the risk of maintaining the continuity of the keys used for encrypting and hashing the data when it is protected. If there are expressions that need to be kept under protection for a long time, it is possible to proceed with different methods, or different developments can be made according to our own needs using the interfaces provided by this API.

An important point is that the Protect() overload used here takes a “text” expression (the interface itself works on byte arrays). Therefore, slightly more complex data can be protected by “serializing” it first (for example, using JsonSerializer).
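A compact round trip may help before the full page model. This sketch assumes the Microsoft.AspNetCore.DataProtection.Extensions package; the Invitation record and the two-second lifetime are made up for the demo. It serializes an object, protects it, reads it back inside the lifetime, and then shows the failure after expiry.

```csharp
using System;
using System.Security.Cryptography;
using System.Text.Json;
using System.Threading;
using Microsoft.AspNetCore.DataProtection;

// Hypothetical payload used only for this demo.
public record Invitation(string Email, string Role);

public static class TimeLimitedDemo
{
    public static void Main()
    {
        var protector = DataProtectionProvider.Create("SomeApplication")
            .CreateProtector("SomeApplication.TimeLimitedData")
            .ToTimeLimitedDataProtector();

        // Serialize first, then protect the resulting text for two seconds.
        string json = JsonSerializer.Serialize(new Invitation("somemail@mail.com", "reader"));
        string token = protector.Protect(json, lifetime: TimeSpan.FromSeconds(2));

        // Inside the lifetime: the payload and its expiry time come back.
        string roundTripped = protector.Unprotect(token, out DateTimeOffset expiresAt);
        var invitation = JsonSerializer.Deserialize<Invitation>(roundTripped);
        Console.WriteLine($"{invitation?.Email} is valid until {expiresAt:O}");

        // Wait until the lifetime has passed.
        Thread.Sleep(TimeSpan.FromSeconds(3));

        try
        {
            protector.Unprotect(token);
        }
        catch (CryptographicException ex)
        {
            Console.WriteLine($"Expired: {ex.Message}");
        }
    }
}
```

The overload with the out parameter is handy when the remaining validity should be shown to the user or logged.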

To see the complete picture more clearly, let’s look at the below code of a Razor page model from an ASP.NET Core application as an example.

namespace SomeApplication.Pages
{
    using Microsoft.AspNetCore.DataProtection;
    using Microsoft.AspNetCore.Mvc.RazorPages;
    using Microsoft.Extensions.Logging;
    using System;
    using System.Text.Json;
 
 
    public class IndexModel : PageModel
    {
        private readonly ILogger<IndexModel> _logger;
 
        public string ProtectedData { get; private set; }
        public string Data { get; private set; }
        public int LifeTime { get; private set; } = 300;
        public string Error { get; private set; }
 
 
        public IndexModel(ILogger<IndexModel> logger)
        {
            _logger = logger;
        }
 
        public void OnGet(string protectedData)
        {
            var timeLimitedDataProtector = DataProtectionProvider.Create("SomeApplication")
                .CreateProtector("SomeApplication.TimeLimitedData")
                .ToTimeLimitedDataProtector();
 
            // The protectedData query-string value is missing from the URL
            if (string.IsNullOrEmpty(protectedData))
            {
                //Let's have a simple data model as example
                var data = new SomeDataModel
                {
                    Name = "Arda Cetinkaya",
                    EMail = "somemail@mail.com",
                    SomeDate = DateTimeOffset.Now
                };
 
                //Let's serialize this simple data model
                string jsonString = JsonSerializer.Serialize(data, new JsonSerializerOptions
                {
                    WriteIndented = true
                });
 
                Data = jsonString;
 
                // Now let's protect the simple data model
                ProtectedData = timeLimitedDataProtector.Protect(plaintext: jsonString
                    , lifetime: TimeSpan.FromSeconds(LifeTime));
            }
            else
            {
                // When the URL carries a value such as ?protectedData=a412Fe12dada...
                try
                {
                    //Unprotect the protected value
                    string data = timeLimitedDataProtector.Unprotect(protectedData);
                    Data = "Data is valid";
                }
                catch (Exception ex)
                {
                    Error = ex.Message;
 
                }
 
            }
        }
    }
 
    public class SomeDataModel
    {
        public string Name { get; set; }
        public string EMail { get; set; }
        public DateTimeOffset SomeDate { get; set; }
    }
}

In the example above, we protect a JSON expression for LifeTime seconds (300 in this listing; the walkthrough earlier used 20) and associate it with a link. The link, and the value we protected, stay valid for that period. After it elapses, the protected data expires and loses its validity.

I hope this simple and quick post, written after a long break, clears up some question marks and proves useful in your own solutions. See you in the next article.

I originally wrote and published this post in Turkish. This is the English translation.


January 8, 2023

The curse of software development; “Assumptions”

software development

~ 4 minute read.

There is a curse on software development processes, one that everyone knows but cannot escape: “Assumptions”.

Assumption: a thing that is accepted as true or as certain to happen, without proof.
Oxford Languages

Assumptions are made during the design and implementation of software projects, and to some extent that is normal. But if they are not based on data, or are not understood the same way by every stakeholder in the project team, they turn into a dark curse that can show its worst side when nobody expects it.

When we have a problem or a requirement in our business model, we expect to solve it consistently. If we think we solved the problem and it happens again, then obviously we did not solve it as expected. To minimize this and get a consistent solution, we turn to software.

But because of assumptions, we sometimes make these software solutions hard and complex. That complexity causes other problems, or solutions take too long. And as you know, neither is an expected or wanted outcome in any business.

We live in an era where change is inevitable and there are many unknown parameters. This fact is the best friend of assumptions. When a problem or requirement has too many parameters, finding suitable values for them can be difficult or time-consuming. But we must come up with a solution somehow, and this is where assumptions come into play. Making assumptions is not so bad, and is even necessary, if we can base them on some data. As we all know, we solved lots of math questions in school by assuming x equals 3 or x is between 0 and 9.

Have some data…(at least a little)

We need to make assumptions based on data. Implementing against data-based assumptions is easier and more legitimate, and the outcome won't be a surprise. Developing on an assumption that is not based on data can still produce output, but the consistency of that output remains unknown, and that creates risk in the solution and in the business.

So we need to support our assumptions with data. Monitoring and data gathering, proof-of-concept results, and answers to open questions are the main sources. These are not one-time jobs; they should continue for as long as the software lives.

Do documentation…

Everybody might have their own assumptions. If they are not documented well and not shared with other stakeholders, they are potential root causes of upcoming problems. Sometimes assumptions are built on top of other assumptions that are not based on data. Without any accompanying document or data, these can be a grenade with the pin pulled. Or everyone ends up with a different view of the solution, which is not good for consistency.

So assumptions should be documented. The documentation should describe the reason for and the validity of each assumption. It is crucial to keep this document up to date.

Don’t cause over-engineering…

Software developers are (or might be 😁) more focused on the solution than on the problem from time to time. Sometimes any approach at all gets implemented for the requirement without thinking about the exact problem. This is why we have the term “over-engineering”. And when assumptions meet the love of doing fancy things, results might not be as expected. If we add “overestimation” as a side dish, I guarantee there will be errors and problems along the journey.

Because of assumptions, unnecessary implementations and high complexity start to appear in the code base, and I doubt that is good for any code base. Making everything a generic feature, adding unnecessary configuration, or violating the YAGNI (You aren't gonna need it) principle are just some basic outcomes of making assumptions.

So during implementation, if we have questions or unknowns, we should try to find answers instead of making assumptions. Depending on the project's current state, answers may be very hard to find. In that case, along with the data and documentation approach above, we can back assumptions with tests. If we have tests for the implementations we assumed, the assumptions become easier to manage.
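One possible way to back assumptions with tests is to write each one down next to the data it is based on and a check that can be re-run. The sketch below is only an illustration: the Assumption record, the batch-size example and the sample numbers are all invented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape: what we assume, what it is based on, and how to re-check it.
public record Assumption(string Statement, string BasedOn, Func<bool> Check);

public static class AssumptionChecks
{
    // Returns the statements of all assumptions whose check no longer passes.
    public static List<string> FindViolations(IEnumerable<Assumption> assumptions) =>
        assumptions.Where(a => !a.Check()).Select(a => a.Statement).ToList();

    public static void Main()
    {
        int[] sampledBatchSizes = { 12, 40, 7, 95 }; // invented sample data

        var assumptions = new[]
        {
            new Assumption("A batch never exceeds 100 items",
                           "sampled production batches",
                           () => sampledBatchSizes.Max() <= 100),
            new Assumption("A batch is never empty",
                           "sampled production batches",
                           () => sampledBatchSizes.All(n => n > 0)),
        };

        foreach (var statement in FindViolations(assumptions))
            Console.WriteLine($"Assumption no longer holds: {statement}");
    }
}
```

Run against fresh data on a schedule or in a test suite, a list like this doubles as the documentation of the assumptions and as an early warning when one of them stops being true.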

Sometimes assumptions are inevitable. When they are, we need to know how to handle them and how to make a good one. Briefly, for a consistent software solution:

Make assumptions based on some data.
Document your assumptions.
Test and try to validate your assumptions.

See you in the next post; until then, happy coding. And remember: just because the sun has risen every day doesn't mean that every day is bright. 🤓
