Non-functional requirements

A list of non-functional requirements for recall

After reading books and articles and watching videos, I put together a list of the non-functional requirements that matter most for line-of-business (LOB) apps.

A list of non-functional requirements for LOB apps

Data volume

  • How much data will the system accumulate over time?
  • What data is required on Day 1?
  • How fast does the data grow each month/year?
  • Is this going to influence what database to use?
  • Is this going to influence the design of your queries?
  • Is this going to influence your storage capacity and network planning?
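
To answer the growth questions above, a back-of-the-envelope projection is usually enough. This is a minimal sketch; the function names and the sample numbers are my own assumptions, not measurements from a real system.

```python
def project_linear_gb(day_one_gb: float, monthly_growth_gb: float, months: int) -> float:
    """Volume after `months` if a fixed amount of data is added every month."""
    return day_one_gb + monthly_growth_gb * months


def project_compound_gb(day_one_gb: float, monthly_growth_rate: float, months: int) -> float:
    """Volume after `months` if the data grows by a fixed percentage every month."""
    return day_one_gb * (1 + monthly_growth_rate) ** months


if __name__ == "__main__":
    # Hypothetical inputs: 50 GB on Day 1, observed for three years.
    print(project_linear_gb(50, 10, 36))               # +10 GB per month
    print(round(project_compound_gb(50, 0.05, 36), 1))  # +5% per month
```

Comparing the linear and compound results for the same horizon is a quick way to see whether the database, query design, and storage plan still hold in year three.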

Latency [in milliseconds]

  • How long does it take to read or insert data through the API? This is the latency, in milliseconds.
  • How much time does it take to read/write a file?
  • What is the required response time?
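
A required response time is easier to agree on when it is stated as a percentile rather than an average. The sketch below times any callable and reports p50/p95/p99 in milliseconds; `measure_latency_ms` and `summarize` are hypothetical helper names, and the timed operation is whatever API call or file read you care about.

```python
import statistics
import time
from typing import Callable


def measure_latency_ms(operation: Callable[[], object], runs: int = 100) -> list[float]:
    """Run `operation` repeatedly and return one latency sample (ms) per run."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples: list[float]) -> dict[str, float]:
    """Return median, 95th and 99th percentile of the samples."""
    cuts = statistics.quantiles(samples, n=100)  # 99 cut points: cuts[i] is percentile i + 1
    return {"p50": statistics.median(samples), "p95": cuts[94], "p99": cuts[98]}


if __name__ == "__main__":
    # Stand-in workload; replace with a real API call or file read.
    print(summarize(measure_latency_ms(lambda: sum(range(10_000)))))
```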

Throughput [X requests/second]

  • How many tasks can be performed in a given unit of time?
  • How many files can be read in a second?
  • How many users can save data in the database simultaneously?
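
Throughput can be estimated the same way as latency: run the task in a loop for a fixed window and divide the count by the elapsed time. A minimal single-threaded sketch, with `measure_throughput` as a hypothetical helper name:

```python
import time
from typing import Callable


def measure_throughput(operation: Callable[[], object], duration_s: float = 1.0) -> float:
    """Return how many times per second `operation` completed within the window."""
    count = 0
    start = time.perf_counter()
    while time.perf_counter() - start < duration_s:
        operation()
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed


if __name__ == "__main__":
    # Stand-in workload; replace with a file read or a database write.
    print(f"{measure_throughput(lambda: sum(range(1_000))):.0f} operations/second")
```

This measures one worker only. For the "simultaneously" questions the same loop has to run from several threads or processes against the real system.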

Load [e.g. 600 simultaneous requests without crashing]

  • The amount of simultaneous work the system can handle before crashing
  • How many concurrent requests can be processed before the system starts crashing?

Concurrent users

  • How many users will use the system simultaneously?
    Concurrent Users => Including “Dead time”
    Load => Excluding “Dead time” (actual requests/second)
    Rule of thumb => Concurrent users = Load × 10
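
The rule of thumb above converts in both directions. In this small sketch the factor of 10 comes from the list itself, while the function names are my own:

```python
DEAD_TIME_FACTOR = 10  # rule of thumb from the list: concurrent users = load x 10


def concurrent_users_from_load(load_rps: float, factor: float = DEAD_TIME_FACTOR) -> float:
    """Estimate how many concurrent users produce a given load (requests/second)."""
    return load_rps * factor


def load_from_concurrent_users(users: float, factor: float = DEAD_TIME_FACTOR) -> float:
    """Estimate the actual load (requests/second) generated by concurrent users."""
    return users / factor


if __name__ == "__main__":
    print(concurrent_users_from_load(60))    # 600 users, most of them in "dead time"
    print(load_from_concurrent_users(5000))  # 500 requests/second to plan for
```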


  • On app level
  • On platform level


Availability

  • What is the required uptime?
  • Manage client expectations. 99.99% is not realistic
  • How do you detect SLA failure? What is the reporting mechanism?
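
Turning an uptime percentage into a downtime budget makes the conversation with the client concrete. A minimal sketch, assuming a 365-day year (the function name is hypothetical):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes_per_year(uptime_percent: float) -> float:
    """Return the yearly downtime budget implied by an uptime target."""
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):
        minutes = allowed_downtime_minutes_per_year(target)
        print(f"{target}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

At 99.99% the budget is under an hour per year, planned maintenance included, which is why that target is rarely realistic for a typical LOB app.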


Scalability

  • How much compute/storage can be added without interruption?
  • Keep components stateless or add stateful sidecar
  • How many redundant instances per type of service should be there?


  • Redundancy (to enable resilience during internal system failures)
    • Db + replication
    • Web backend
    • Message Broker
    • Workers
  • Patterns for reliable message processing
  • Persistent queues – to avoid overloading/provide backpressure
  • Retry strategies, timeouts, fallbacks to make the system predictable
  • Idempotency – to allow retry requests without worries of double records. Idempotency keys.
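
Retries and idempotency keys only work together: the retry makes the call predictable, and the key makes the retry safe. The sketch below uses an in-memory dictionary as a stand-in for a database table with a unique key column; `InMemoryStore` and `call_with_retry` are hypothetical names, not a real library API.

```python
import time
import uuid
from typing import Callable


class InMemoryStore:
    """Stand-in for a table with a unique constraint on the idempotency key."""

    def __init__(self) -> None:
        self.records: dict[str, dict] = {}

    def save(self, idempotency_key: str, payload: dict) -> dict:
        # The first write wins; a retry with the same key returns the stored record.
        return self.records.setdefault(idempotency_key, payload)


def call_with_retry(operation: Callable[[], object], attempts: int = 3,
                    base_delay_s: float = 0.1) -> object:
    """Retry `operation` on ConnectionError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller fall back
            time.sleep(base_delay_s * 2 ** attempt)


if __name__ == "__main__":
    store = InMemoryStore()
    key = str(uuid.uuid4())  # generated once by the client, reused on every retry
    call_with_retry(lambda: store.save(key, {"amount": 42}))
    call_with_retry(lambda: store.save(key, {"amount": 42}))  # simulated retry
    print(len(store.records))  # still a single record
```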


  • System failures
  • Poison messages
  • Functional defects
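
A poison message is one that fails every time it is processed, so it must be parked somewhere instead of blocking the queue forever. Here is a minimal sketch of that idea with an in-memory queue and a dead-letter list; the `drain` function and the attempt limit are assumptions for illustration, and a real message broker would provide its own mechanism.

```python
from collections import deque
from typing import Callable


def drain(queue: deque, handler: Callable[[str], None], max_attempts: int = 3) -> list[str]:
    """Process (message, attempts) pairs; return messages that kept failing."""
    dead_letters = []
    while queue:
        message, attempts = queue.popleft()
        try:
            handler(message)
        except Exception:
            attempts += 1
            if attempts >= max_attempts:
                dead_letters.append(message)       # park it for manual inspection
            else:
                queue.append((message, attempts))  # requeue for another try
    return dead_letters


if __name__ == "__main__":
    def handler(message: str) -> None:
        if message == "malformed":
            raise ValueError("cannot parse message")

    work = deque([("order-1", 0), ("malformed", 0), ("order-2", 0)])
    print(drain(work, handler))  # only the malformed message is dead-lettered
```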


Monitoring

  • Know what is going on in the system and be informed in time to take action
  • Monitoring agents on platform and app level
  • Alerting


  • What do we use for observability and high-cardinality logs?
  • What do we do about logging? Platform, app, component, logs-metrics-traces?


  • Extend functionality without modifying existing code base or downtime


Backup

  • What needs to be backed up?
  • How much data can we afford to lose?
  • Deltas or full backups?
  • How do we plan for encryption malware (ransomware)?

Disaster recovery

  • What recovery time do we need/aim for?
  • Amount of data lost on recovery / snapshot frequency
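
The link between snapshot frequency and data loss is simple arithmetic: everything written after the last good snapshot is gone. A small sketch with made-up timestamps (the function name is hypothetical):

```python
from datetime import datetime, timedelta


def data_loss_window(snapshot_times: list[datetime], failure_time: datetime) -> timedelta:
    """Return how much recent data is lost when restoring the latest snapshot."""
    usable = [t for t in snapshot_times if t <= failure_time]
    if not usable:
        raise ValueError("no snapshot exists before the failure")
    return failure_time - max(usable)


if __name__ == "__main__":
    day = datetime(2024, 1, 1)
    snapshots = [day + timedelta(hours=h) for h in (0, 6, 12, 18)]  # every 6 hours
    failure = day + timedelta(hours=11, minutes=30)
    print(data_loss_window(snapshots, failure))  # 5:30:00
```

The worst case equals the full snapshot interval, so the interval has to be no longer than the amount of data loss the business can accept.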

Configuration management

  • Dynamic change of configuration
  • Security of configuration
  • Process to change configuration
  • Capability to roll-back configuration
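
Dynamic change and roll-back both become straightforward if every configuration change creates a new version instead of overwriting the old one. A minimal in-memory sketch of that idea; `VersionedConfig` is a hypothetical class, and a real system would persist the history and restrict who can call `update`.

```python
class VersionedConfig:
    """Keeps every configuration version so the last change can be undone."""

    def __init__(self, initial: dict) -> None:
        self._history = [dict(initial)]

    @property
    def current(self) -> dict:
        # Return a copy so callers cannot bypass versioning by mutating it.
        return dict(self._history[-1])

    def update(self, **changes) -> None:
        """Apply changes as a new version; older versions stay untouched."""
        self._history.append({**self._history[-1], **changes})

    def rollback(self) -> None:
        """Discard the latest version and return to the previous one."""
        if len(self._history) == 1:
            raise RuntimeError("nothing to roll back")
        self._history.pop()


if __name__ == "__main__":
    config = VersionedConfig({"timeout_s": 30, "feature_x": False})
    config.update(feature_x=True)
    config.rollback()
    print(config.current)  # back to the initial values
```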

Deployment model and topology

  • Tech stack for the CI/CD
  • Time to roll-out / roll-back
  • Ability for A/B or canary deployments
  • How do we roll out database model changes?


  • Mean time to diagnose and fix a problem
  • What are the applicable KPIs and alerts?


Testing

  • What do we test?
  • How do we test?
  • How do we keep tests current and avoid bad practices?
  • CI/CD integration of the tests


  • Capability to change a component while abiding by the interface


  • How to scan for prohibited licenses in dependencies?
  • How to stop licenses from proliferating in the pipelines?


  • Interface Contract Management for external system integrations
  • OpenAPI specs


Caching

  • The next worst thing after naming conventions 😀
  • Where do we cache data?
  • How do you test performance?
  • Are you prematurely inserting caching into the system?
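
One way to check whether a cache is earning its place is to measure its hit rate before keeping it. The sketch below uses Python's built-in `functools.lru_cache`; the lookup function is a made-up stand-in for a slow database or API call.

```python
from functools import lru_cache


@lru_cache(maxsize=1024)
def get_customer_name(customer_id: int) -> str:
    """Hypothetical slow lookup; the body stands in for a database query."""
    return f"customer-{customer_id}"


def hit_rate() -> float:
    """Share of calls answered from the cache so far (0.0 if never called)."""
    info = get_customer_name.cache_info()
    total = info.hits + info.misses
    return info.hits / total if total else 0.0


if __name__ == "__main__":
    for customer_id in (1, 2, 1, 1, 3):
        get_customer_name(customer_id)
    print(get_customer_name.cache_info())
    print(f"hit rate: {hit_rate():.0%}")
```

A low hit rate under realistic traffic is a hint that the cache adds invalidation problems without buying much performance.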


Not an exhaustive list, but it works well as a memory aid when consulted.