The following document is a hypothetical design provided to illustrate the process of security reviewing design documents. Intended as a learning tool, it omits many details that would be present in a real design, focusing instead on security aspects. As such, it is not a complete example of a real software design document.

Italic text is intended as meta-description about this design document. I use it to remark on the document’s pedagogical purpose and explain shortcuts I’ve taken. I use bold text to highlight security-related content: examples of good security practice in a design, what features a good designer adds, or points that security reviewers should be raising.

Title – Private Data Logging Component Design Document

Section 1 – Product Description
Section 2 – Overview
Section 3 – Use Cases
Section 4 – System Architecture
Section 5 – Data Design
Section 6 – API
Section 7 – User Interface Design
Section 8 – Technical Design
Section 9 – Configuration
Section 10 – References
END OF DOCUMENT

Section 1 – Product Description

This document describes a logging component (herein called Logger) that provides standard software event logging facilities to support auditing, system monitoring, and debugging, designed to mitigate risks of inadvertent information disclosure. Logger will explicitly handle private data within logs so that non-private data can be freely accessed for routine uses. In rare cases when this access level is insufficient, limited access to protected, private log data can be provided, subject to explicit approval and with restrictions to eminimize potential exposure.

The notion of explicitly handling private data separately within the context of a logging system is an example of security-centric design thinking. Adding this feature to an existing system would be less efficient and require considerable code churn*, compared to designing it in from the start.*

Section 2 – Overview

For baseline project design assumptions, see the documents listed in Section 10.

2.1 Purpose

All applications in the datacenter need to log details of important software events, and since these logs potentially contain private data, careful access control needs to be enforced. Logger provides standard components to generate logs, store logs, and enforce appropriate access to authorized staff while maintaining a reliable and non-repudiable record of what access does occur. Since the logging, access, and retention requirements of systems vary, Logger operates based on a simple policy configuration that specifies an access policy.

2.2 Scope

This document explains the design of the software components of Logger without mandating the choice of implementation language, deployment, or operational considerations.

2.3 Concepts

The notion of a filtered view of logs is core to the design. The idea is to allow relatively free inspection of the logs with any private details filtered out, an access level which should suffice for most uses. Additionally, when needed, sensitive data that is logged can be inspected, subject to additional authorization. The access event is logged too, making the fact of inspection auditable. This graduated access lets applications log important private details while still minimizing how that data is exposed to legitimate uses by internal staff. Data so sensitive that it should never appear in logs simply should not be logged in the first place.

For example, web applications routinely log HTTPS requests as a record of system usage and for many other reasons. Often these logs contain private information (including IP addresses, cookies, and much more) that must be captured but is rarely needed. For example, IP addresses are useful when investigating malicious attacks (to identify the origin of an attack), but for other uses are immaterial. A filtered view of logs hides, or “wraps,” private data while showing nonsensitive data. Designated pseudonyms in a filtered view can show that, for instance, the IP addresses of all events labeled “IP7” are identical without disclosing the actual address. Often such a filtered view provides sufficient information for the purposes of monitoring, gathering statistics, or debugging, and when that is the case it’s advantageous to have avoided exposing any private data at all. The logs still contain the full data, and in the rare cases when the protected information is required the unfiltered view is available in a controlled manner with proper authorization.

Suppose that a web application receives a user login attempt which triggers a bug that causes the process to crash. Here is a simplified example of what the log might contain:

2018/06/23 08:09:10 66.77.88.99 POST login.htm {user: "SAM", password: ">1<}2{]3[\\4/"}

The items in this log are: timestamp (not sensitive), IP address (sensitive), HTTP verb and URL (not sensitive), username (sensitive), password (very sensitive). An investigation potentially needs to consider all this information in order to reproduce the bug, but you don’t want to display this data in plaintext unless absolutely necessary, and then only to authorized agents.

To address the security needs of a wide range of systems, the sensitivity of various kinds of log data should be configurable, and the logging system should only selectively reveal confidential data. For example, as a best practice URLs should not contain sensitive information, but a legacy system might be known to violate this rule of thumb and require protection not usually necessary—which makes the filtered view less useful for some debugging. In the case of a URL, regular expressions could facilitate configuring certain URLs as more sensitive than others.

A filtered view of the previous example log that omits or wraps the sensitive data might look like this:

The IP address, username, and password are all wrapped as identifiers to hide the data, but the substituted identifiers could be used in context to query other requests with matching values. In this example, US1 designates an IP address in the US; USER1 designates the username associated with the event without divulging it specifically; and PW1 stands for the password submitted. The suffixes in parentheses indicate the format or length of the actual data, adding a hint without revealing specific details: we can see that it’s an IPv4 address, the username has 3 characters, and the password has 12. For example, if an excessively long password caused a problem, this fact would be apparent from its surprising length alone. Knowing the length of the password leaks a little information but should not be compromising in practice.

When the filtered view is insufficient for the task at hand, an additional request to unwrap an identifier such as US1 can be made. This makes seeing the sensitive data an explicit choice, and allows a graduated revealing of data. For example, if only the IP address is needed, the username and password values remain undisclosed.

2.4 Requirements

Logs are reliably stored, immediately accessible with authorization, and destroyed after the required retention period. To support high volumes of use, the log capture interface must be fast, and once it reports success, the generating application may rightly assume that the log is stored.

Logs can be monitored without knowledge of private details, so a filtered log view can be made widely available for most uses, with special authorization needed to see the full data (including private data) only when strictly necessary.

The logs system is used exclusively by technical staff and so all users are expected to be capable of understanding a complex interface and the workings of software-generated logs.

An important goal of this design is to allow logging of very sensitive private data that can be made available for investigation of possible security incidents or, in rare cases, debugging of issues that only occur in production. Complete mitigation against an insider attack is an impractical goal, but it’s important to take all reasonable precautions and preserve a strong audit trail as a deterrent.

Storage for logs is encrypted to protect against leaks if the physical media is stolen.

Software generating logs is fully trusted; it must correctly identify private data in order for Logger to handle it correctly.

2.5 Non-Goals

As Logger is intended for use by admins, a slick UI is unnecessary.

Insider attacks such as code tampering or abuse of admin root privilege are out of scope.

To be effective, Logger requires careful configuration and oversight. How this is implemented must be defined by system management but should include a review process and auditing with checks and balances.

2.6 Outstanding Issues

Details of log access configuration, user authentication, and grants of unfiltered access authorization remain to be specified.

Query of encrypted private data is inherently slow. This design envisions that log data volumes are sufficiently small that a brute-force pass (that is, without reliance on an index) decrypting records on demand will be performant. A more ambitious future version might tackle indexing and fast querying over encrypted data.

Error cases need to be identified and handling specified.

Enhancements for future versions of Logger to consider include:

Defining levels of filtered views that provide more or less detailed information
Providing a facility to capture portions of the log for long-term secure storage that would eventually be routinely deleted

2.7 Alternative Designs

The final design chosen is based on fully trusting Logger to store all sensitive information in logs, putting “all eggs in one basket.” An alternative was considered that allowed sensitive information to be compartmentalized by source. This was not pursued for a few reasons (sketched following) that did not appear compatible with important use scenarios, but it is important to note that this would arguably be a more secure logging solution.

Alternative design

Log sources would create an asymmetric cryptographic key pair and use it to encrypt the sensitive data portions of log records before sending to Logger. If this were done carefully, Logger could (probably) still generate pseudonyms for filtered views (for example, US1 for a certain IP address in the US). Authorized access to unfiltered views would then require the private key in order to decrypt the data. The main advantage of this approach is that disclosure of stored log data would not leak sensitive data that was encrypted, and Logger would not even have the necessary key(s).

Reasons not chosen

This design puts the burden of encryption and key management on both log sources and authorized accessors. The designation of what data is sensitive and how it should be partitioned is determined by the log source and fixed at that time. By centralizing trust in Logger, both of these aspects can be reconfigured as needed, and fine-grained access can be controlled by authenticating the log viewer.

Section 3 – Use Cases

Applications in the datacenter generate logs of important software events using Logger. Routine monitoring software and appropriate operational staff are allowed filtered access (data views without disclosure of any private data) for their routine duties. Operational statistics including traffic levels, active users, error rates, and so forth are all generated from filtered log views.

Rarely, when support or debugging requires access to the unfiltered logs, authorized staff may get limited access subject to policy. Access requests specify the subset of logs needed, their time window, and the reason for the access. Once approved, a token is issued that permits the access, which is logged for audit. Upon completion, the requester adds a note describing the result of the investigation, which is reviewed by the approver to ensure propriety.

Reports detailing summaries of requests, approvals, and audit reviews, as well as log volume trends and confirmation of expired log data deletion, are generated to inform management.

Section 4 – System Architecture

Within the datacenter, Logger service instances run on physically separate machines operated independently from the applications they serve, via a standard publish/subscribe protocol. Logger is constituted from three new services organized as the following functions:

Logger Recorder

A log storage service. Applications stream log event data over an encrypted channel to the Logger Recorder service, where they are written to persistent storage. One instance may be configured to handle logs for more than one application.

Logger Viewer

A web application that technical staff uses to manually inspect filtered logs, with the ability to reveal unfiltered views subject to authorization according to policy. Subject to access policy, Logger Viewer provides access to filtered and (with authorization) unfiltered logs.

Logger Root Recorder

A special instance of Logger Recorder that logs events of Logger Recorder and Viewer. For simplicity we omit the details of filtered and unfiltered views of this log.

Section 5 – Data Design

Log data is collected directly from applications that determine what events, with what details, should be logged. Logs are append-only records of software events, and are never modified other than being deleted upon expiration.

Applications define a schema of log event types, with zero or more items of preconfigured data, as illustrated by the following example. All log events must have a timestamp and at least one other identifying data item.

{LogTypes: [login, logout, . . .]}
{LogType: login, timestamp: time, IP: IPaddress, verb: string,
 URL: string, user: string, password: string, cookies: string}
{LogType: logout, timestamp: time, IP: IPaddress, verb: string,
 URL: string, user: string, cookies: string}
{Filters: {timestamp: minute, IP: country, verb: 0, URL: 0,
 user: private, password: pw_mask, cookies: private}}

Many details regarding built-in types, formatting, and so forth are omitted since the basic idea of how these would be defined should be clear from this partial example.

Requests and responses must be UTF-8-encoded valid JSON expressions less than 1 million characters in length. Individual field values are limited to at most 10,000 characters.

The first line (LogTypes) enumerates the types of log events this application will produce. For each type, a JSON record with the corresponding LogType entry (the second line is for LogType: login) lists the allowable data items that may be provided with such a log.

The fourth line (Filters) declares the disposition of each data item: 0 for nonsensitive data, private for private data to be “wrapped,” and other special types of data handling, including:

– minute — Time value is rounded to the nearest minute (obscuring precise times)

– country — IP addresses are mapped to country of origin in the filtered view

Filters should be defined by pluggable components and easily extended to support custom data types that various applications will require.

Note that “nonsensitive” data should be used for limited internal viewing only; this designation does not mean that this data should be publicly disclosed. The requirement that all data items be declared, including disposition (private or not), is to ensure that explicit decisions are made about each one in the context of the application. It is critical that these definitions and any updates have careful scrutiny to ensure the integrity of the log processing.

Here is an example log entry in the unfiltered view for this schema:

2018/06/23 08:09:10 66.77.88.99 POST login.html {user: "SAM", password: ">1<}2{]3[\\4/"}

And this is the corresponding filtered view:

2018/06/23 08:09 US1(v4) POST login.html {user: USER1(3), password: PW1(12)}

Data is stored persistently and available until the policy-configured expiration date is reached, measured as time elapsed since the event log timestamp.

Logs are transient data only intended for monitoring and debugging or for forensic purposes in the case of a security breach, and as such are only kept for a limited time. Potential data loss is mitigated by storing the data on a dedicated machine, using a redundant RAID (or similar) disk array for redundant persistent storage. Logs are intended as short-term storage for auditing and diagnostic purposes. Long-term storage of any of this data should be created separately, derived from the logs, and encrypted to secure access control.

Section 6 – API

The Logger Recorder’s network interface accepts the following remote procedure calls:

Hello — Must be the first API call of the session; identifies the application and version

Schema — Defines the log data schema (see Section 5)

Log — Sends event data (see Section 5) to be recorded to the specified log

Goodbye — Sent when the application terminates, ending the session

Each application connects to its logging service via a dedicated channel. HTTPS secures API invocations between authenticated endpoints; the preconfigured server name authenticates (by its digital certificate) that clients are connected to valid Logger service instances. The following are the request types.

6.1 Hello Request

Any process that will use the Logger service sends this request to initiate the logging:

{"verb": "Hello", "source": "Sample application", "version": "1"}

The following response acknowledges the request with an OK or error and provides a string token for the session:

{"status": "OK", "service": "Logger", "version": "1", "token": "XYZ123"}

The token is used in subsequent requests to identify the context of the initiating application corresponding to the Hello, allowing multiple applications to log over a single connection. Tokens are generated randomly with sufficient complexity and entropy to preclude guessing: the minimum recommended token size is 120 bits, or about 20 characters in base64 encoding. Shorter tokens are used here for brevity.

6.2 Schema Definition Request

This request defines the data schema for subsequent logging, as described in Section 5:

{"verb": "Schema", "token": "XYZ123", ...}

Details of this request are omitted for brevity.

The schema defines the field names, types, and other attributes that will appear in the log contents, as illustrated by the sample event log request shown in the following section (which includes the fields timestamp, ipaddr, http, url, and error).

6.3 Event Log Request

This request actually logs one record with the Logger service:

{"verb": "Event", "token": "XYZ123", "log":
 {"timestamp": 1234567890, "ipaddr": "12.34.56.78", 
  "http": "POST", "url": "example", "error": "404"}}

The log JSON presents content to be recorded to the log that must match the schema.

The response acknowledges the request with an OK or error:

{"status": "OK"}

Error details are omitted for brevity. Logging errors (for example, insufficient storage space) are serious and require immediate attention, since system operation is not auditable in the absence of logging.

6.4 Goodbye Request

This request completes a session of logging:

{"verb": "Goodbye", "token": "XYZ123"}

The response acknowledges the request with an OK or error:

{"status": "OK"}

The token thereafter is no longer valid. To resume logging, the client must first make a Hello request.

Section 7 – User Interface Design

The user interface to the Logger is a web interface served by Logger Viewer that is used to examine the logs. The web app is only accessible by authorized operations staff and authenticated by enterprise single sign-on. Authenticated users see a selection of logs they are allowed to access, with links to browse or search the most recent filtered log entries or, when allowed, to request access to unfiltered logs subject to approval.

For brevity, only a high-level description of the web interface is provided for this example.

Approval requests are queued for processing in a web form that provides basic information:

The reason access is requested, including specifics such as customer issue ticket numbers
The scope of access requested (typically a specific user account or IP address)

Approval requests trigger automated email to approvers with a link to the web app page to review these requests. When the decision is taken an email notifies the requester with the following:

An approval or denial
Reason for denial, if applicable
Time window for approved access

Filtered and unfiltered logs are visible on a page corresponding to each log. Queries may be entered specifying which log entries to view. An empty query shows the most recent entries with Next/Previous links for paging through the results.

Queries specify log entry fields and values, combined with Boolean operators to select matching log entries. Most recent first is the default order, unless an explicit ordering is given in the query. For brevity, the details of query syntax are omitted.

Filtered logs are displayed with symbolic identifiers (see Section 2.3 for an example) instead of the raw log contents. Queries may use symbolic identifiers present in filtered log content; for example, if a filtered log entry shows IP address US1, a query of \[IP = US1\] would find other logs from that IP address without disclosing the address itself.

Queries over filtered logs must disallow searches on filtered fields with exact values. For example, even if IP addresses are not shown, if the user can guess [IP = 1.1.1.1] (and so forth) they may eventually hit a log entry that will show it as something like USA888 and then be able to infer the actual value.

Even when unfiltered access is approved, users must select an option to begin unfiltered viewing and querying. Best practice maximizes use of filtered logs, only revealing filtered values on an as-needed basis, and it is important that the user interface encourage this.

Users can renounce the right to unfiltered log access when the task is completed. The user interface should promote this after a period of inactivity to minimize risk of unnecessary access.

Web pages displaying log contents should not be locally cached by user agents to avoid inadvertent disclosure and to ensure that, on expiry, the log data is no longer available.

Section 8 – Technical Design

The Logger Recorder service consists of a write-only interface for applications to stream log event data that will be written to persistent storage, and a query interface to get views of those logs. Storage is a sequence of write-append files consisting of UTF-8 lines of text, with one line per log event. Log data as described by the relevant schema (see above) maps to/from a canonical representation as text. Details of formatting are omitted for this example.

Log data fields subject to filtering should be stored in the filtered representation in addition to the raw data encrypted with an AES key generated by the service, using a new key every day. Use a hardware key storage or suitable means of securely protecting these keys.

Since exhausting available storage represents a fatal error for a logging service, the write rate is measured against free space (free_storage_MB /avg_logging_MB_per_hour) and a priority operational alert is raised if space for fewer than 10 hours of data, assuming constant write volumes, remains (this number of hours to alert is configurable).

For performance, consider a SQL database recording filtered log event information (timestamp, log type, filename, and offset), supplementing the actual log files for efficient access.

Filtered logs hide private data with symbolic identifiers (for example, US1 for an IP address in the US). To avoid storing unfiltered private data, these maps go from a secure digest of the unfiltered data value to the filtered moniker. This mapping is temporary and maintained by Logger Viewer separately for each user context per log. Users have the ability to clear mappings for a fresh start, or after 24 hours of non-use they are automatically cleared to prevent useless buildup over time.

Logger Root Recorder is a special instance of Logger Recorder that logs the actions of Logger itself.

Section 9 – Configuration

Log retention is configured as follows. Data is automatically deleted after its retention period. Data is securely and permanently deleted beyond the retention period (not just moved to trash; use the shred(1) command or similar).

Retention: {
  "Log1": {"days": 10},
  "Log2": {"hours": 24},
}

Log access is granted by configuring lists of authorized users:

Access: {
  "Log1": {"filtered": ["u1", "u2", "u3", . . .],
  "unfiltered": ["x1", "x2", "x3", . . .],
  "approval": ["a1", "a2", "a3", . . .]},
}

Users allowed filtered access to the log denoted Log1 are listed within brackets, as shown above (for example, u1, u2, u3). Users permitted unfiltered access are then similarly listed. These users will be granted access only following an approved request. Finally, users with the power to grant approval for limited unfiltered access are listed in the same manner.

Section 10 – References

The following documents are useful for understanding this design document.

These are fictional.

Enterprise baseline design assumptions document (referenced in Section 2)
Enterprise general data protection policy and guidelines
Publish/subscribe protocol design document (referenced in Section 4)

Sample Design Document