From 42d8064cab2bf9458ca0b58b79ef584206c7585b Mon Sep 17 00:00:00 2001
From: Edward Shen
Date: Sat, 1 May 2021 20:50:39 -0400
Subject: [PATCH] Partial work in ASEV

---
 src/notes/2021-03-27-a.mdx | 120 +++++++++++++++++++++++++++++++++++++
 src/pages/index.mdx        |   2 +-
 2 files changed, 121 insertions(+), 1 deletion(-)
 create mode 100644 src/notes/2021-03-27-a.mdx

diff --git a/src/notes/2021-03-27-a.mdx b/src/notes/2021-03-27-a.mdx
new file mode 100644
index 0000000..5cbbc06
--- /dev/null
+++ b/src/notes/2021-03-27-a.mdx
@@ -0,0 +1,120 @@
+---
+path: "absolute-sev"
+date: 2021-03-27
+title: "Absolute Site Event Score"
+hidden: true
+---
+
+Given a Site Event (SEV) number `USERS.CLIENTS.INTERNAL`:
+
+1. `USERS` is defined as the _per deci_ of users affected.
+2. `CLIENTS` is defined as the _per deci_ of clients affected.
+3. `INTERNAL` is defined as the _per deci_ of the internal system affected.
+
+Where _per deci_ is defined as a percent divided by ten and rounded up.
+Additional labels for noteworthy "named" events are available as extensions to
+the `USERS.CLIENTS.INTERNAL` format.
+
+## Introduction
+
+explain problem here.
+
+As a solution to this problem, I propose a similar set of rules heavily inspired
+by the [Semantic Versioning 2.0.0] document. Engineers and administration may
+use this number as a way to convey the priority and importance of each event
+without necessarily knowing the specifics of that event. This permits
+ordering for review and provides a short-code for engineers to quickly
+acknowledge the scope of the situation.
+
+This system, despite being closely modeled after Semver, intends to provide an
+absolute metric to compare site events, regardless of the scale of the site. As
+a result, I call this system "Absolute SEV Score".
+
+[Semantic Versioning 2.0.0]: https://semver.org/spec/v2.0.0.html
+
+## Absolute Site Event Score (ASEV)
+
+The key words `MUST`, `MUST NOT`, `REQUIRED`, `SHALL`, `SHALL NOT`, `SHOULD`,
+`SHOULD NOT`, `RECOMMENDED`, `MAY`, and `OPTIONAL` in this document are to be
+interpreted as described in [RFC 2119].
+
+The term _per deci_ is defined as the percent of the affected element divided by
+ten and rounded up to the nearest integer. For example, if 1%, 3.643%, 10%, and
+so forth of users are affected, then the _per deci_ value is defined as 1. If
+50.000001% of internal systems are affected, then the _per deci_ value is 6.
+
+1. A normal ASEV number `MUST` take the form A.B.C, where A, B, and C are
+either integers in the range of 0-10, inclusive, that `MUST NOT` contain leading
+zeros, or one of the values `U`, `IP`, or `X`. A is the `USERS` number, B is the
+`CLIENTS` number, and C is the `INTERNAL` number.
+2. The `USERS` number `MUST` represent the _per deci_ of people (e.g. end users)
+affected by the ASEV.
+3. The `CLIENTS` number `MUST` represent the _per deci_ of clients that this
+ASEV affects. This value `MUST` be defined as an unweighted total _per deci_ of
+clients affected.
+4. The `INTERNAL` number `MUST` represent the _per deci_ of internal systems
+that this ASEV affects. The value `MUST` be defined as an unweighted total
+_per deci_ of internal systems affected.
+5. Elements with the value `U` `MUST` transition to the value `IP`, `X`, or a
+numeric value. Elements with the value `IP` `MUST` transition to the value `X`
+or a numeric value. Numeric values `SHOULD` represent the live _per deci_ value
+of the target affected. It is `RECOMMENDED` that an additional ASEV number be
+kept that records the maximum numeric value reached by every element.
+6. Elements with the value `U`, `IP`, and `X` represent the states "Unknown",
+"In Progress", and "No data", respectively.
These values denote that the true
+impact of the ASEV has not been evaluated yet, or in the case of `X`, cannot be
+evaluated.
+7. ASEV numbers `MAY` be marked with a label by appending a hyphen and a series
+of dot separated identifiers following the `INTERNAL` number. Identifiers `MUST`
+comprise only ASCII alphanumerics and hyphens `[0-9A-Za-z-]`. Identifiers
+`MUST NOT` be empty. Numeric identifiers `MAY` include leading zeros. It is
+`RECOMMENDED` to give high impact ASEVs a name. It is `RECOMMENDED` to have a
+unique identifier as a machine identification string. Examples:
+`3.2.5-spinlock.2314`, `5.3.9-4.20.2021`, `5.4.3-911-called.4fajh5z`.
+8. ASEV numbers in the format `A.B.C-name.id` are in "Named Normal form"
+and numbers in the format `A.B.C-id` are in "Normal form" if the `id`
+identifier is unique within the organization, and all ASEV values within the
+organization are in either Named Normal or Normal form. It is `RECOMMENDED` that
+organizations use a combination of Named Normal and Normal form ASEV numbers.
+9. The precedence order for element values is as follows: `X`, `IP`, `U`,
+followed by 10 down to 0, and is determined by the first left-to-right difference.
+It is `RECOMMENDED` that, if all ASEVs are in either Normal or Named Normal
+form, Named Normal numbers have precedence over Normal numbers.
+
+
+[RFC 2119]: https://tools.ietf.org/html/rfc2119
+
+## Backus-Naur Form Grammar for Valid Absolute Site Event Scores
+
+todo
+
+## Why use Absolute Site Event Scores?
+
+#### Why can any value decrease over time?
+
+The ASEV value is intended to represent the impact of a dynamically changing
+situation. Values
+
+#### Why can't a numerical value return to an alphabetical value?
+
+The alphabetical values are meant to convey the ability to monitor the situation.
+During a site event, new events with unknown impact may be denoted as `U.U.U`,
+indicating that personnel are needed to evaluate the situation.
Likewise, an
+`X.X.X` score represents an event with no way to measure the impact. Once these
+values can be measured, they should be promoted to values that represent the
+estimated impact. If a metric was incorrect, then the value can be updated
+to reflect the corrected value.
+
+#### Why do alphabetical values have higher precedence than numbered values?
+
+The inability to measure impact in due time or at all is a serious concern for
+the long term stability of a service or platform. As a result, these values take
+higher precedence than other events with known measured values for two reasons.
+One, it is possible that the impact may be as severe or more severe than
+previous cases, so it is best to perform analysis as soon as possible. Two, the
+inability to detect and measure this data implies a systemic failure to
+prevent this incident from occurring in the future.
+
+## License
+
+[Creative Commons—CC BY 3.0](https://creativecommons.org/licenses/by/3.0/)
diff --git a/src/pages/index.mdx b/src/pages/index.mdx
index 8218db0..d22af04 100644
--- a/src/pages/index.mdx
+++ b/src/pages/index.mdx
@@ -13,7 +13,7 @@ import Navbar from '../components/navbar';
 -----
 
 Hey there. I'm just an software engineer with varying degrees of interest in
-distributed systems, safety and correctness, cybersecurity, homelabbing,
+distributed systems, safety and correctness, cybersecurity, performance,
 homebrewing (both kinds!), and keyboards. I like Rust!
 
 #### Contact Info
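
Illustrative note on the ASEV rules added in src/notes/2021-03-27-a.mdx: the patch leaves the grammar section as "todo", so the following is only a minimal Python sketch of how the _per deci_ definition, the A.B.C label format (rules 1 and 7), and the element precedence (rule 9) might be checked. The names `per_deci`, `parse_asev`, and `precedence_key` are hypothetical and not part of the patch, the numeric ranks given to `U`, `IP`, and `X` are an assumption chosen only to reproduce the stated ordering, and the Named Normal versus Normal tie-break recommended in rule 9 is not implemented.

```python
import math
import re

# Assumed ranks so that X > IP > U > 10 > ... > 0, as rule 9 orders them.
_LETTER_RANK = {"U": 11, "IP": 12, "X": 13}

# One element: 0-10 without leading zeros, or one of U, IP, X (rule 1).
_ELEMENT = r"(?:10|[0-9]|U|IP|X)"
# One label identifier: non-empty ASCII alphanumerics and hyphens (rule 7).
_IDENT = r"[0-9A-Za-z-]+"
_ASEV_RE = re.compile(
    rf"({_ELEMENT})\.({_ELEMENT})\.({_ELEMENT})(?:-({_IDENT}(?:\.{_IDENT})*))?"
)


def per_deci(percent: float) -> int:
    """Return the percent affected divided by ten and rounded up."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be within 0-100")
    return math.ceil(percent / 10)


def parse_asev(text: str):
    """Split an ASEV string into its three elements and its label identifiers."""
    match = _ASEV_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid ASEV number: {text!r}")
    users, clients, internal, label = match.groups()
    identifiers = label.split(".") if label else []
    return (users, clients, internal), identifiers


def precedence_key(text: str):
    """Sort key: larger tuples take precedence, compared left to right."""
    elements, _ = parse_asev(text)
    return tuple(
        _LETTER_RANK[e] if e in _LETTER_RANK else int(e) for e in elements
    )


if __name__ == "__main__":
    events = ["3.2.5-spinlock.2314", "U.U.U", "10.0.0", "X.1.0"]
    # Highest precedence first.
    print(sorted(events, key=precedence_key, reverse=True))
```

Tuple comparison is used for the sort key because it is decided by the first position where two keys differ, which matches the "first left-to-right difference" wording of rule 9.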