Benchmarking Working Group M. Konstantynowicz
Internet-Draft V. Polak
Intended status: Informational Cisco Systems
Expires: 11 January 2024 10 July 2023
Multiple Loss Ratio Search
draft-ietf-bmwg-mlrsearch-04
Abstract
This document proposes improvements to [RFC2544] throughput search by
defining a new methodology called Multiple Loss Ratio search
(MLRsearch). The main objectives for MLRsearch are to minimize the
total test duration, search for multiple loss ratios and improve
the repeatability and comparability of results.
The main motivation behind MLRsearch is the new set of challenges and
requirements posed by testing Network Function Virtualization (NFV)
systems and other software based network data planes.
MLRsearch offers several ways to address these challenges, giving
users configuration options to select their preferred way.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 11 January 2024.
Copyright Notice
Copyright (c) 2023 IETF Trust and the persons identified as the
document authors. All rights reserved.
Konstantynowicz & Polak Expires 11 January 2024 [Page 1]
Internet-Draft MLRsearch July 2023
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Table of Contents
1. Purpose and Scope . . . . . . . . . . . . . . . . . . . . . . 3
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1. General notions . . . . . . . . . . . . . . . . . . . . . 4
2.1.1. General and specific quantities . . . . . . . . . . . 4
2.1.2. Composite . . . . . . . . . . . . . . . . . . . . . . 5
2.1.3. SUT . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1.4. Trial . . . . . . . . . . . . . . . . . . . . . . . . 6
2.1.5. Load . . . . . . . . . . . . . . . . . . . . . . . . 6
2.1.6. Duration . . . . . . . . . . . . . . . . . . . . . . 6
2.1.7. Duration sum . . . . . . . . . . . . . . . . . . . . 7
2.1.8. Width . . . . . . . . . . . . . . . . . . . . . . . . 7
2.1.9. Loss ratio . . . . . . . . . . . . . . . . . . . . . 8
2.1.10. Exceed ratio . . . . . . . . . . . . . . . . . . . . 8
2.2. Architecture . . . . . . . . . . . . . . . . . . . . . . 8
2.2.1. Manager . . . . . . . . . . . . . . . . . . . . . . . 9
2.2.2. Measurer . . . . . . . . . . . . . . . . . . . . . . 9
2.2.3. Controller . . . . . . . . . . . . . . . . . . . . . 11
2.2.4. Controller input . . . . . . . . . . . . . . . . . . 12
2.2.5. Controller internals . . . . . . . . . . . . . . . . 15
2.2.6. Controller output . . . . . . . . . . . . . . . . . . 26
3. Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.1. Long Test Duration . . . . . . . . . . . . . . . . . . . 26
3.2. DUT within SUT . . . . . . . . . . . . . . . . . . . . . 27
3.3. Repeatability and Comparability . . . . . . . . . . . . . 28
3.4. Throughput with Non-Zero Loss . . . . . . . . . . . . . . 29
3.5. Inconsistent Trial Results . . . . . . . . . . . . . . . 30
4. How the problems are addressed . . . . . . . . . . . . . . . 30
5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 31
6. Security Considerations . . . . . . . . . . . . . . . . . . . 31
7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 32
8. References . . . . . . . . . . . . . . . . . . . . . . . . . 32
8.1. Normative References . . . . . . . . . . . . . . . . . . 32
8.2. Informative References . . . . . . . . . . . . . . . . . 32
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 32
1. Purpose and Scope
The purpose of this document is to describe Multiple Loss Ratio
search (MLRsearch), a throughput search methodology optimized for
software DUTs.
Applying vanilla [RFC2544] throughput bisection to software DUTs
results in a number of problems:
* Binary search takes too long as most trials are done far from
the eventually found throughput.
* The required final trial duration (and pauses between trials) also
prolongs the overall search duration.
* Software DUTs show noisy trial results (noisy neighbor problem),
leading to a big spread of possible discovered throughput values.
* Throughput requires loss of exactly zero packets, but the industry
frequently allows for small but non-zero losses.
* The definition of throughput is not clear when trial results are
inconsistent.
MLRsearch aims to address these problems by applying the following
set of enhancements:
* Allow searching for multiple search goals, with differing goal
loss ratios.
- Each trial result can affect any search goal in principle
(trial reuse).
* Use multiple preceding targets for each search goal; earlier
targets need to spend less time on trials.
- Earlier targets also aim at lesser precision.
- Use Forwarding Rate (FR) at maximum offered load [RFC2285]
(section 3.6.2) to initialize the initial targets.
* Take care when dealing with inconsistent trial results.
- Loss ratio goals are handled in an order that minimizes the
chance of interference from later trials to earlier goals.
* Apply several load selection heuristics to save even more time by
trying hard to avoid unnecessarily narrow bounds.
MLRsearch configuration options are flexible enough to support both
conservative settings (unconditionally compliant with [RFC2544], but
longer search duration and worse repeatability) and aggressive
settings (shorter search duration and better repeatability but not
compliant with [RFC2544]).
No part of [RFC2544] is intended to be obsoleted by this document.
2. Terminology
When a subsection defines a term, the first paragraph acts as the
definition. Other paragraphs are treated as a description; they
provide additional details without being needed to define the term.
Definitions should form a directed acyclic graph of dependencies. If
a section contains subsections, the section definition may depend on
the subsection definitions. Otherwise, any definition may depend on
preceding definitions. In other words, if the section definition
were to come after subsections, there would be no forward
dependencies for people reading just definitions from start to
finish.
Descriptions provide motivations and explanations; they frequently
reference terms defined only later. Motivations in section
descriptions are the reason why section text comes before subsection
text.
2.1. General notions
General notions are the terms defined in this section.
It is useful to define the following notions before delving into
MLRsearch architecture, as the notions appear in multiple places with
no place being special enough to host the definition.
2.1.1. General and specific quantities
General quantity is a quantity that may appear multiple times in
MLRsearch specification, perhaps each time in a different role. The
quantity when appearing in a single role is called a specific
quantity.
It is useful to define the general quantity, so definitions of
specific quantities may refer to it. We say a specific quantity is
based on a general quantity, if the specific quantity definition
refers to and relies on the general quantity definition.
It is natural to name specific quantities by adding an adjective (or
a noun) to the name of the general quantity. But existing RFCs
typically explicitly define a term acting in a specific role, so the
RFC name directly refers to a specific quantity, while the
corresponding general quantity is defined only implicitly. Therefore
this document defines general quantities explicitly, even if the
same term already appears in an RFC.
In practice, it is required to know which unit of measurement is used
to accompany a numeric value of each quantity. The choice of a
particular unit of measurement is not important for MLRsearch
specification though, so specific units mentioned in this document
are just examples or recommendations, not requirements.
When reporting, it is REQUIRED to state the units used.
2.1.2. Composite
A composite is a set of named attributes. Each attribute is either a
specific quantity or a composite.
MLRsearch specification frequently groups multiple specific
quantities into a composite. The description of such a composite
gives insight into why this and other terms are defined as they are.
Such insight would be harder to communicate with the specific
quantities alone.
Also, it simplifies naming of specific quantities, as they usually
can share a noun or adjective referring to their common composite.
Most relations between composites and their specific quantities
can be described using plain English.
Perhaps the only exception involves referring to specific quantities
as attributes. For example if there is a composite called 'target',
and one of its specific quantities is 'target width' defined using a
general quantity 'width', we can say 'width is one of target
attributes'.
2.1.3. SUT
As defined in RFC 2285: The collective set of network devices to
which stimulus is offered as a single entity and response measured.
While RFC 2544 mostly refers to DUT as a single (network
interconnecting) device, section 19 makes it clear multiple DUTs can
be treated as a single system, so most of RFC 2544 also applies to
testing SUT.
MLRsearch specification only refers to SUT (not DUT), even if it
consists of just a single device.
2.1.4. Trial
A trial is the part of test described in RFC 2544 section 23.
When traffic has been sent and SUT response has been observed, we say
the trial has been performed, or the trial has been measured. Before
that happens, multiple possibilities for upcoming trial may be under
consideration.
2.1.5. Load
Intended, constant load for a trial, usually in frames per second.
Load is the general quantity implied by Constant Load of RFC 1242,
Data Rate of RFC 2544 and Intended Load of RFC 2285. All three
specify this value applies to one (input or output) interface, so we
can talk about unidirectional load also when bidirectional or multi-
port traffic is applied.
MLRsearch does not rely on this distinction; it also works if the
load values correspond to an aggregate rate (sum over all SUT tested
input or output interface unidirectional loads), as long as all loads
share the same semantics.
Several RFCs define useful quantities based on Offered Load (instead
of Intended Load), but MLRsearch specification works only with
(intended) load. Those useful quantities still serve as motivations
for a few specific quantities used in MLRsearch specification.
MLRsearch assumes most load values are positive. For some (but not
all) specific quantities based on load, zero may also be a valid
value.
2.1.6. Duration
Intended duration of the traffic for a trial, usually in seconds.
This general quantity does not include any preparation or waiting
described in section 23 of RFC 2544. Section 24 of RFC 2544 places
additional restrictions on duration, but those restrictions apply only
to some of the specific quantities based on duration.
Duration is always positive in MLRsearch.
2.1.7. Duration sum
For a specific set of trials, this is the sum of their durations.
Some of the specific quantities based on duration sum are derived
quantities, without a specific set of trials to sum their durations.
Duration sum is never negative in MLRsearch.
2.1.8. Width
General quantity defined for an ordered pair (lower and higher) of
load values, which describes a distance between the two values.
The motivation for the name comes from binary search. The binary
search tries to approximate an unknown value by repeatedly bisecting
an interval of possible values, until the interval becomes narrow
enough. Width of the interval is a specific quantity and the
termination condition compares that to another specific quantity
acting as the threshold. The threshold value does not have a
specific interval associated, but corresponds to a 'size' of the
compared interval. As size is a word already used in definition of
frame size, a more natural word describing interval is width.
The MLRsearch specification does use (analogues of) upper bound and
lower bound, but does not actually need to talk about intervals.
Still, the intervals are implicitly there, so width is the natural
name.
Actually, there are two popular options for defining width. Absolute
width is based on load, the value is the higher load minus the lower
load. Relative width is dimensionless, the value is the absolute
width divided by the higher load. As intended loads for trials are
positive, relative width is between 0.0 (including) and 1.0
(excluding).
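The two width options can be illustrated with a short Python sketch.
The function names and the input check are illustrative assumptions,
not part of the MLRsearch specification.

```python
def absolute_width(lower: float, higher: float) -> float:
    """Return the higher load minus the lower load (same unit as load)."""
    # Assumption: loads are positive and the pair is ordered.
    if not 0.0 < lower <= higher:
        raise ValueError("expected 0 < lower <= higher")
    return higher - lower


def relative_width(lower: float, higher: float) -> float:
    """Return the absolute width divided by the higher load (dimensionless)."""
    return absolute_width(lower, higher) / higher
```

For example, loads of 900000 and 1000000 frames per second give an
absolute width of 100000 frames per second and a relative width of
0.1.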
Relative width as a threshold value may be useful for users who do
not presume what is the typical performance of SUT, but absolute
width may be a more familiar concept.
MLRsearch specification does not prescribe which width has to be
used, but widths MUST be either all absolute or all relative, and it
MUST be clear from the report which option was used (it is implied from
the unit of measurement of any width value).
2.1.9. Loss ratio
The loss ratio is a general quantity, dimensionless floating point
value assumed to be between 0.0 and 1.0, both including. It is
computed as the number of frames lost (not forwarded) by SUT, divided
by the number of frames that should have been forwarded during the
trial.
If the number of frames that should have been forwarded is zero, the
loss ratio is considered to be zero (but it is better to use high
enough loads to prevent this).
Loss ratio is basically the same quantity as Frame Loss Rate of RFC
1242, just not expressed as a percentage.
RFC1242 Frame Loss Rate: Percentage of frames that should have been
forwarded by a network device under steady state (constant) load that
were not forwarded due to lack of resources.
(RFC2544 restricts Frame Loss Rate to a type of benchmark, for loads
100% of 'maximum rate', 90% and so on.)
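A minimal Python sketch of the loss ratio computation follows,
counting frames that were not forwarded, in line with the RFC 1242
Frame Loss Rate quoted above. The function and parameter names are
illustrative, and the sketch assumes the measurer has already decided
how to count duplicate or out of order frames, so the forwarded count
never exceeds the expected count.

```python
def loss_ratio(frames_expected: int, frames_forwarded: int) -> float:
    """Return the fraction of expected frames that were not forwarded."""
    if frames_expected < 0 or not 0 <= frames_forwarded <= frames_expected:
        raise ValueError("expected 0 <= frames_forwarded <= frames_expected")
    if frames_expected == 0:
        # No frames should have been forwarded: the ratio is taken as zero.
        return 0.0
    return (frames_expected - frames_forwarded) / frames_expected
```

Multiplying the returned value by 100 gives the corresponding
percentage.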
2.1.10. Exceed ratio
This general quantity is a dimensionless floating point value,
defined using two duration sum quantities. One duration sum is
referred to as the good duration sum, the other is referred to as the
bad duration sum. The exceed ratio value is computed as the bad
duration sum value divided by the sum of the two sums. If both sums
are zero, the exceed ratio is undefined.
As there are no negative duration sums in MLRsearch, exceed ratio
values are between 0.0 and 1.0 (both including).
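As an illustration, the exceed ratio computation can be sketched in
Python as below. Returning None for the undefined case is a choice
made for this sketch only; the names are illustrative.

```python
from typing import Optional


def exceed_ratio(good_duration_sum: float,
                 bad_duration_sum: float) -> Optional[float]:
    """Return bad / (good + bad), or None when both sums are zero."""
    if good_duration_sum < 0.0 or bad_duration_sum < 0.0:
        raise ValueError("duration sums are never negative")
    total = good_duration_sum + bad_duration_sum
    if total == 0.0:
        # The exceed ratio is undefined when both sums are zero.
        return None
    return bad_duration_sum / total
```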
2.2. Architecture
MLRsearch architecture consists of three main components: the
manager, the controller and the measurer.
The search algorithm is implemented in the controller, and it is the
main focus of this document.
Most implementation details of the manager and the measurer are out
of scope of this document, except when describing how those
components interface with the controller.
2.2.1. Manager
The manager is the component that initializes SUT, traffic generator
(called tester in RFC 2544), the measurer and the controller with
intended configurations. It then hands over the execution to the
controller and receives its result.
Managers can range from simple CLI utilities to complex Continuous
Integration systems. From the controller point of view it is
important that no additional configuration (nor warmup) is needed for
SUT and the measurer to perform trials.
The interface between the manager and the controller is defined in
the controller section.
One execution of the controller is called a search. Some benchmarks
may execute multiple searches on the same SUT (for example when
confirming the performance is stable over time), but this document
is concerned with only one invocation (others may be understood as
part of SUT preparation).
Creation of reports of appropriate format can also be understood as
the responsibility of the manager. This document places requirements
on which information has to be reported.
2.2.2. Measurer
The measurer is the component which performs one trial as described
in RFC 2544 section 23, when requested by the controller.
From the controller point of view, it is a function that accepts
trial input and returns trial output.
This is the only way the controller can interact with SUT. In
practice, the measurer has to make subtle decisions when converting
the observed SUT behavior into a single trial loss ratio value, for
example how to deal with out of order frames or duplicate frames.
On software implementation level, the measurer is a callable,
injected by the manager into the controller instance.
The act of performing one trial (act of turning trial input to trial
output) is called a measurement, or trial measurement. This way we
can talk about trials that were measured already and trials that are
merely planned (not measured yet).
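The "callable" view of the measurer can be sketched in Python as
follows. The class names, the type alias and the fake measurer are
assumptions made for illustration; a real measurer drives a traffic
generator and is out of scope here. The two composites are described
in the following subsections.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TrialInput:
    load: float      # intended load, e.g. frames per second
    duration: float  # intended traffic duration, e.g. seconds


@dataclass(frozen=True)
class TrialOutput:
    loss_ratio: float
    forwarding_rate: float


# The controller only needs something it can call with a trial input.
Measurer = Callable[[TrialInput], TrialOutput]


def make_fake_measurer(capacity: float) -> Measurer:
    """Return a deterministic stand-in that forwards up to `capacity`."""
    def measure(trial_input: TrialInput) -> TrialOutput:
        forwarded = min(trial_input.load, capacity)
        lost_fraction = (trial_input.load - forwarded) / trial_input.load
        return TrialOutput(loss_ratio=lost_fraction, forwarding_rate=forwarded)
    return measure
```

A manager would create such a callable once and inject it into the
controller, which then calls it for every trial it decides to measure.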
2.2.2.1. Trial input
The load and duration to use in an upcoming trial.
This is a composite.
Other quantities needed by the measurer are assumed to be constant
and set up by the manager before search starts (see traffic profile),
so they do not count as trial input attributes.
2.2.2.1.1. Trial load
Trial load is the intended load for the trial.
This is a specific quantity based on load, directly corresponding to
RFC 2285 intended load.
2.2.2.1.2. Trial duration
Trial duration is the intended duration for the trial.
This is a specific quantity based on duration, so it specifies only
the traffic part of the trial, not the waiting parts.
2.2.2.2. Traffic profile
Any other configuration values needed by the measurer to perform a
trial.
The measurer needs both trial input and traffic profile to perform
the trial. As trial input contains the only values that vary during
one search, traffic profile remains constant during the search.
Traffic profile when understood as a composite is REQUIRED by RFC
2544 to contain some specific quantities (for example frame size).
Several more specific quantities may be RECOMMENDED.
Depending on SUT configuration (e.g. when testing specific
protocols), additional values need to be included in the traffic
profile and in the test report. (See other IETF documents.)
2.2.2.3. Trial output
A composite consisting of trial loss ratio and trial forwarding rate.
Those are the only two specific quantities (among other quantities
possibly measured in the trial, for example offered load) that are
important for MLRsearch.
2.2.2.3.1. Trial loss ratio
Trial loss ratio is a specific quantity based on loss ratio. The
value is related to a particular measured trial, as measured by the
measurer.
2.2.2.3.2. Trial forwarding rate
Trial forwarding rate is a derived quantity. It is computed as one
minus trial loss ratio, multiplied by trial load.
Despite the name, the general quantity this specific quantity
corresponds to is load (not rate). The name is inspired by RFC 2285,
which defines Forwarding Rate specific to one output interface.
As the definition of loss ratio is not necessarily per-interface
(one of details left for the measurer), using the definition above
(instead of RFC 2285) makes sure trial forwarding rate is always
between zero and the trial load (both including).
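The derivation can be written as a one-line Python function. This is
only a sketch; the name and the range checks are illustrative.

```python
def trial_forwarding_rate(trial_load: float, trial_loss_ratio: float) -> float:
    """Return (1 - trial loss ratio) * trial load, in the unit of load."""
    if trial_load < 0.0 or not 0.0 <= trial_loss_ratio <= 1.0:
        raise ValueError("load must be non-negative, loss ratio within [0, 1]")
    return (1.0 - trial_loss_ratio) * trial_load
```

With the loss ratio restricted to values between zero and one, the
result cannot leave the range from zero to the trial load.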
2.2.2.4. Trial result
Trial result is a composite consisting of trial input attributes and
trial output attributes.
Those are all the specific quantities related to a measured trial
that MLRsearch needs.
While distinction between trial input and output is important when
defining the interface between the controller and the measurer, it is
easier to talk about trial result when describing how measured trials
influence the controller behavior.
2.2.3. Controller
The component of MLRsearch architecture that calls the measurer and
returns conditional throughputs to the manager.
This component implements the search algorithm, the main content of
this document.
Contrary to Throughput as defined in RFC 1242, the definition of
conditional throughput is quite sensitive to the controller input (as
provided by the manager), and its full definition needs several terms
which would otherwise be hidden as internals of the controller
implementation.
The ability of conditional throughput to be less sensitive to
performance variance, and the ability of the controller to find
conditional throughputs for multiple search goals within one search
(and in short overall search time) are strong enough motivations for
the need of increased complexity.
2.2.4. Controller input
A composite of max load, min load, and a set of search goals.
The search goals (as elements of the set of search goals) are usually
not named and unordered.
It is fine if all search goals of the set have the same value of a
particular attribute. In that case, the common value may be treated
as a global attribute (similarly to max and min load).
The set of search goals MUST NOT be empty. Two search goals within
the set MUST differ in at least one attribute. The manager MAY avoid
both issues by presenting an empty report or de-duplicating the search
goals, but it is RECOMMENDED for the manager to raise an error to its
caller, as the two conditions suggest the test is improperly
configured.
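The two requirements on the set of search goals can be checked as in
the following Python sketch, which takes the RECOMMENDED route of
raising an error. It assumes each goal is represented by a hashable
value (for example a tuple of its attributes); the function name is
illustrative.

```python
from typing import Hashable, Sequence


def check_search_goals(goals: Sequence[Hashable]) -> None:
    """Raise ValueError if the goals are empty or contain duplicates."""
    if not goals:
        raise ValueError("the set of search goals must not be empty")
    if len(set(goals)) != len(goals):
        # Two goals compare equal, so they do not differ in any attribute.
        raise ValueError("two search goals have identical attributes")
```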
2.2.4.1. Max load
Max load is a specific quantity based on load. No trial load is ever
higher than this value.
RFC 2544 section 20 defines maximum frame rate based on theoretical
maximum rate for the frame size on the media. RFC 2285 section 3.5.3
specifies Maximum offered load (MOL) which may be lower than maximum
frame rate. There may be other limitations preventing high loads,
for example resources available to the traffic generator.
The manager is expected to provide a value that is not greater than
any known limitation. Alternatively, the measurer is expected to
work at max load, possibly reporting as lost any frames that were not
able to leave Traffic Generator.
From the controller point of view, this is merely a global upper
limit for any trial load candidates.
2.2.4.2. Min load
Min load is a specific quantity based on load. No trial load is ever
lower than this value.
The motivation of this quantity is to prevent trials with too few
frames sent to SUT.
Also, in practice, if a SUT is able to reach only very small
forwarding rates (min load indirectly serves as a threshold for how
small), it may be considered faulty (or perhaps the test is
misconfigured).
2.2.4.3. Search goal
A composite of 7 attributes (see subsections).
If not otherwise specified, 'goal' always refers to a search goal in
this document.
The controller input may contain multiple search goals. The name
Multiple Loss Ratio search was created back when goal loss ratio was
the only attribute allowed to vary between goals.
Each goal will get its conditional throughput discovered and reported
at the end of the search.
The definitions of the 7 attributes are not very informative by
themselves. Their motivation (and naming) becomes more clear from
the impact they have on conditional throughput.
2.2.4.3.1. Goal loss ratio
A specific quantity based on loss ratio. A threshold value for trial
loss ratios. MUST be lower than one.
Trial loss ratio values will be compared to this value, a trial will
be considered bad if its loss ratio is higher than this.
For example, RFC 2544 throughput has goal loss ratio of zero, a trial
is bad once a single frame is lost.
Loss ratio of one would classify each trial as good (regardless of
loss), which is not useful.
2.2.4.3.2. Goal initial trial duration
A specific quantity based on duration. A threshold value for trial
durations. MUST be positive.
MLRsearch is allowed to use trials as short as this when focusing on
this goal. The conditional throughput may be influenced by shorter
trials (measured when focusing on other search goals).
2.2.4.3.3. Goal final trial duration
A specific quantity based on duration. A threshold value for trial
durations. MUST be no smaller than goal initial trial duration.
MLRsearch is allowed to use trials as long as this when focusing on
this goal. If more data is needed, repeated trials at the same load
and duration are requested by the controller.
2.2.4.3.4. Goal min duration sum
A specific quantity based on duration sum. A threshold value for a
particular duration sum.
MLRsearch requires at least this amount of (effective) trials for a
particular load to become part of MLRsearch outputs.
It is possible (though maybe not practical) for goal min duration sum
to be smaller than goal final trial duration.
In practice, the sum of durations actually spent on trial measurement
can be smaller (when trial results are quite one-sided) or even
larger (in presence of shorter-than-final trial duration results at
the same load).
If the sum of all (good and bad) long trials is at least this, and
there are no short trials, then the load is guaranteed to be
classified as either an upper or a lower bound.
In some cases, the classification is known sooner, when the 'missing'
trials cannot change the outcome.
When short trials are present, the logic is more complicated.
2.2.4.3.5. Goal exceed ratio
A specific quantity based on exceed ratio. A threshold value for
particular sets of trials.
An attribute used for classifying loads into upper and lower bounds.
If the duration sum of all (current duration) trials is at least min
duration sum, and more than this fraction of the duration sum comes
from bad trials, this load is an upper bound.
If there are shorter duration trials, the logic is more complicated.
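For the simple case without shorter duration trials, the rule can be
sketched in Python as below. The sketch rests on two assumptions that
are a reading of the text, not a statement of the specification: a
load that has enough duration sum and does not exceed the goal exceed
ratio is a lower bound, and a load is classified early once the still
missing duration sum cannot change the outcome. All names are
illustrative.

```python
from typing import Optional


def classify_load(good_sum: float, bad_sum: float,
                  min_duration_sum: float,
                  goal_exceed_ratio: float) -> Optional[str]:
    """Return "upper", "lower", or None when more trials are needed."""
    measured = good_sum + bad_sum
    # Pretend the load is measured until min duration sum is reached.
    whole = max(measured, min_duration_sum)
    if whole == 0.0:
        return None
    missing = whole - measured
    # Even if all missing trials turn out good, bad trials exceed the ratio.
    if bad_sum / whole > goal_exceed_ratio:
        return "upper"
    # Even if all missing trials turn out bad, the ratio is not exceeded.
    if (bad_sum + missing) / whole <= goal_exceed_ratio:
        return "lower"
    return None
```

For example, with min duration sum of 21 seconds and exceed ratio of
0.5, eleven seconds of bad trials already make the load an upper
bound, and eleven seconds of good trials already make it a lower
bound.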
2.2.4.3.6. Goal width
A specific quantity based on width. A threshold value for a
particular width. MUST be positive.
This defines the exit condition for this search goal.
Relevant bounds (of the final target) need to be this close before
conditional throughput can be reported.
2.2.4.3.7. Preceding targets
A non-negative integer affecting the behavior of the controller.
How many additional non-final targets to add. Each next preceding
target has double width and min duration sum geometrically closer to
initial trial duration.
The usage of preceding targets is an important source of MLRsearch
time savings (compared to simpler search algorithms).
Having this value configurable lets the manager tweak the overall
search duration based on presumed knowledge of SUT performance
stability.
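The seven attributes and their stated constraints can be collected in
a small Python data class, sketched below. The class and field names
are illustrative; the bounds on min duration sum and exceed ratio
follow from the general quantity definitions rather than from an
explicit requirement in the attribute subsections.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchGoal:
    loss_ratio: float
    initial_trial_duration: float
    final_trial_duration: float
    min_duration_sum: float
    exceed_ratio: float
    width: float
    preceding_targets: int

    def __post_init__(self) -> None:
        if not 0.0 <= self.loss_ratio < 1.0:
            raise ValueError("goal loss ratio MUST be lower than one")
        if self.initial_trial_duration <= 0.0:
            raise ValueError("goal initial trial duration MUST be positive")
        if self.final_trial_duration < self.initial_trial_duration:
            raise ValueError("final trial duration is below the initial one")
        if self.min_duration_sum < 0.0:
            raise ValueError("duration sum is never negative")
        if not 0.0 <= self.exceed_ratio <= 1.0:
            raise ValueError("exceed ratio is between 0.0 and 1.0")
        if self.width <= 0.0:
            raise ValueError("goal width MUST be positive")
        if self.preceding_targets < 0:
            raise ValueError("preceding targets is a non-negative integer")
```

The values used when constructing a goal (for example a 60 second
final trial duration) are chosen by the manager; none are implied by
this sketch.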
2.2.5. Controller internals
Terms not directly corresponding to the controller's input nor
output, but needed indirectly as dependencies of the conditional
throughput definition.
Following these definitions specifies virtually all of the controller
(MLRsearch algorithm) logic.
2.2.5.1. Pre-initial trials
Up to three special trials executed at the start of the search. The
first trial load is max load, subsequent trial loads are computed from
the preceding trial forwarding rate.
The main loop of the controller logic needs at least one trial
result, and time is saved if the trial results are close to future
conditional throughput values.
The exact way to compute the load for the second and third trial (and
whether to measure the second or third trial at all) is not specified
here, as the implementation details have negligible effect on the
reported conditional throughput.
2.2.5.2. Search target
A composite of 5 specific quantities (see subsections). Frequently
called just target.
Similar to (but distinct from) the search goal.
Each search goal prescribes a final target, probably with a chain of
preceding targets.
More details in the Derived targets section.
2.2.5.2.1. Target loss ratio
Same as loss ratio of the corresponding goal.
2.2.5.2.2. Target exceed ratio
Same as exceed ratio of the corresponding goal.
2.2.5.2.3. Target width
Similar to goal width attribute. Doubled from goal width for each
level of preceding target.
2.2.5.2.4. Target min duration sum
Similar to goal min duration sum attribute. Geometrically
interpolated between initial target duration and goal min duration
sum.
2.2.5.2.5. Target trial duration
When MLRsearch focuses on this target, it measures trials with this
duration. The value is equal to the minimum of goal final trial
duration and target min duration sum.
Also, this value is used to classify trial results as short (if trial
duration is shorter than this) or long.
2.2.5.3. Derived targets
After receiving the set of search goals, MLRsearch internally derives
a set of search targets.
The derived targets can be seen as forming a chain, from initial
target to final target. The chain is linked by a reference from a
target to its preceding (towards initial) target.
The reference may be implemented as a 6th attribute of the target.
2.2.5.3.1. Final target
The final target is the target where most of the attribute values are
directly copied from the corresponding search goal. Final target
width is the same as goal width, final target trial duration is the
same as goal final trial duration, and final target min duration sum
is the same as the goal min duration sum.
The conditional throughput is found when focusing on the final
target. All non-final targets do not directly affect the conditional
throughput, they are there just as an optimization.
2.2.5.3.2. Preceding target
Each target may have a preceding target. Goal attribute Preceding
targets governs how many targets are created in addition to the final
target corresponding to the search goal.
Any preceding target has double width, meaning one balanced bisection
is needed to reduce preceding target width to the next target width.
Preceding target min duration sum is exponentially smaller, aiming
for prescribed initial target min duration sum.
Preceding target trial duration is either its min duration sum, or
the corresponding goal's final trial duration, whichever is smaller.
As the preceding min duration sum is shorter than the next duration
sum, MLRsearch is able to achieve the preceding target width sooner
(than with the next target min duration sum).
This way an approximation of the conditional throughput is found,
with the next target needing not as much time to improve the
approximation (compared to not starting with the approximation).
2.2.5.3.3. Initial target
Initial target is a target without any other target preceding it.
Initial target min duration sum is equal to the corresponding goal's
initial trial duration.
As a consequence, initial target trial duration is equal to its min
duration sum.
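The derivation rules above can be sketched in Python as follows.
This is only an illustration, not the reference implementation: the
class and argument names are hypothetical, and the geometric
interpolation of min duration sums is an assumption (the text only
says the sums become "exponentially smaller").

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Target:
    """One derived target; field names are illustrative only."""
    width: float
    trial_duration: float
    min_duration_sum: float
    preceding: Optional["Target"] = None  # link towards the initial target


def derive_chain(goal_width, initial_trial_duration, final_trial_duration,
                 min_duration_sum, preceding_targets):
    """Return the final target; follow .preceding to reach the initial one."""
    n = preceding_targets
    target = None
    for i in range(n + 1):  # i == 0 is the initial target, i == n the final
        if i == n:
            dursum = min_duration_sum
        else:
            # Assumed geometric interpolation between the prescribed ends.
            ratio = min_duration_sum / initial_trial_duration
            dursum = initial_trial_duration * ratio ** (i / n)
        target = Target(
            width=goal_width * 2 ** (n - i),  # each step back doubles width
            trial_duration=min(dursum, final_trial_duration),
            min_duration_sum=dursum,
            preceding=target,
        )
    return target
```

For example, derive_chain(0.005, 1.0, 60.0, 60.0, 2) yields a chain of
three targets with widths 0.02, 0.01 and 0.005 (initial to final).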
2.2.5.4. Trial classification
Any trial result can be classified according to any target along two
axes.
The two classifications are independent.
This classification is important for defining the conditional
throughput.
2.2.5.4.1. Short trial
If the (measured) trial duration is shorter than the target trial
duration, the trial is called short.
2.2.5.4.2. Long trial
If the (measured) trial duration is not shorter than the target trial
duration, the trial is called long.
2.2.5.4.3. Bad trial
If the (measured) trial loss ratio is larger than the target loss
ratio, the trial is called bad.
For example, if the target loss ratio is zero, a trial is bad as soon
as one frame was lost.
2.2.5.4.4. Good trial
If the (measured) trial loss ratio is not larger than the target loss
ratio, the trial is called good.
For example, if the target loss ratio is zero, a trial is good only
when there were no frames lost.
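A minimal Python sketch of the two independent classifications
follows. The function name and the string labels are hypothetical,
chosen only for this illustration.

```python
def classify_trial(trial_duration, trial_loss_ratio,
                   target_trial_duration, target_loss_ratio):
    """Return (length, quality) labels of one trial for one target."""
    # A trial is short only if strictly shorter than the target duration.
    length = "short" if trial_duration < target_trial_duration else "long"
    # A trial is bad only if its loss ratio strictly exceeds the target.
    quality = "bad" if trial_loss_ratio > target_loss_ratio else "good"
    return length, quality
```

The same trial result can be classified differently by different
targets, as each target has its own trial duration and loss ratio.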
2.2.5.5. Load stat
A composite of 8 quantities (see subsections). The quantities depend
on a target and a load, and are computed from all trials measured at
that load so far.
The MLRsearch output is the conditional throughput, which is a
specific quantity based on load. As MLRsearch may measure multiple
trials at the same load, and those trials may not have the same
duration, we need a way to classify a set of trial results at the
same load.
As the logic is not as straightforward as in other parts of MLRsearch
algorithm, it is best defined using the following derived quantities.
Load stat is the composite for one load and one target. The set of
load stats for one load and all targets is commonly called load stats.
2.2.5.5.1. Long good duration sum
Sum of durations of all long good trials (at this load, according to
this target).
2.2.5.5.2. Long bad duration sum
Sum of durations of all long bad trials (at this load, according to
this target).
2.2.5.5.3. Short good duration sum
Sum of durations of all short good trials (at this load, according to
this target).
2.2.5.5.4. Short bad duration sum
Sum of durations of all short bad trials (at this load, according to
this target).
2.2.5.5.5. Effective bad duration sum
Computed in four steps. Take one divided by the target exceed ratio,
and subtract one. Divide the short good duration sum by that value.
Subtract the result from the short bad duration sum, using zero if
the difference would be negative. The long bad duration sum plus
that value is the effective bad duration sum.
Effective bad duration sum is the long bad duration sum plus some
fraction of short bad duration sum. The fraction is between zero and
one (both inclusive).
If there are no short good trials, effective bad duration sum becomes
the duration sum of all bad trials (long or short).
If an exceed ratio computed from short good duration sum and short
bad duration sum is equal to or smaller than the target exceed ratio,
effective bad duration sum is equal to just long bad duration sum.
Basically, short good trials can only lessen the impact of short bad
trials, while short bad trials directly contribute (unless lessened).
A typical example of why a goal needs higher final trial duration
than initial trial duration is when SUT is expected to have large
buffers, so a trial may be too short to see frame losses due to a
buffer becoming full. So a short good trial does not give strong
information. On the other hand, short bad trial is a strong hint SUT
would lose many frames at that load and long duration. But if there
is a mix of short bad and short good trials, MLRsearch should not
cherry-pick only the short bad ones.
The presented way of computing the effective bad duration sum aims to
be a fair treatment of short good trials.
If the target exceed ratio is zero, the given definition contains
positive infinity as an intermediate value, but still simplifies to a
finite result (long bad duration sum plus short bad duration sum).
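The computation can be sketched in Python as below. The function
and argument names are hypothetical. The divisor is written as
(1/r - 1), which is an assumption of this sketch: it is the form
under which a short-trial exceed ratio at or below the target leaves
exactly the long bad duration sum, as stated above. The edge cases
for exceed ratios of zero and one are handled explicitly to avoid
dividing by zero.

```python
def effective_bad_duration_sum(long_bad, short_good, short_bad,
                               exceed_ratio):
    """Long bad sum plus the short bad part not offset by short good."""
    if exceed_ratio <= 0.0:
        # The divisor is infinite, so short good trials offset nothing.
        offset = 0.0
    elif exceed_ratio >= 1.0:
        # The divisor is zero: any short good time offsets everything.
        offset = float("inf") if short_good > 0.0 else 0.0
    else:
        offset = short_good / (1.0 / exceed_ratio - 1.0)
    # Short good trials can only lessen the impact of short bad trials.
    return long_bad + max(0.0, short_bad - offset)
```

With a target exceed ratio of 0.5, four seconds of short good trials
offset four of six seconds of short bad trials, so only two seconds
are added to the long bad duration sum.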
2.2.5.5.6. Missing duration sum
The target min duration sum minus effective bad duration sum and
minus long good duration sum, or zero if that would be negative.
MLRsearch may need up to this duration sum of additional long trials
before classifying the load.
2.2.5.5.7. Optimistic exceed ratio
The specific quantity based on exceed ratio, where bad duration sum
is the effective bad duration sum, and good duration sum is the long
good duration sum plus the missing duration sum.
This is the value MLRsearch would compare to target exceed ratio
assuming all of the missing duration sum ends up consisting of long
good trials.
If there was a bad long trial, optimistic exceed ratio becomes larger
than zero. Additionally, if the target exceed ratio is zero,
optimistic exceed ratio becomes larger than zero even on one short
bad trial.
2.2.5.5.8. Pessimistic exceed ratio
The specific quantity based on exceed ratio, where bad duration sum
is the effective bad duration sum plus the missing duration sum, and
good duration sum is the long good duration sum.
This is the value MLRsearch would compare to target exceed ratio
assuming all of the missing duration sum ends up consisting of long
bad trials.
Note that if the missing duration sum is zero, optimistic exceed
ratio becomes equal to pessimistic exceed ratio.
This is the role of the target min duration sum: it guarantees the
two exceed ratios eventually become the same. Otherwise, the
pessimistic exceed ratio is always larger than the optimistic exceed
ratio.
Depending on trial results, the missing duration sum may not be large
enough to move the optimistic (or pessimistic) exceed ratio to the
other side of the target exceed ratio. In that case, MLRsearch does
not need to measure more trials at this load when focusing on this
target.
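The last three quantities of a load stat can be sketched in Python
as follows. The names are hypothetical, and the exceed ratio is
assumed here to be the bad duration sum divided by the sum of bad and
good duration sums.

```python
def exceed_ratio(bad_sum, good_sum):
    """Assumed form: bad duration divided by total duration."""
    total = bad_sum + good_sum
    return bad_sum / total if total > 0.0 else 0.0


def load_estimates(long_good, effective_bad, min_duration_sum):
    """Return (missing, optimistic, pessimistic) for one load and target."""
    missing = max(0.0, min_duration_sum - effective_bad - long_good)
    # Optimistic: all missing duration turns out to be long good trials.
    optimistic = exceed_ratio(effective_bad, long_good + missing)
    # Pessimistic: all missing duration turns out to be long bad trials.
    pessimistic = exceed_ratio(effective_bad + missing, long_good)
    return missing, optimistic, pessimistic
```

For example, with 30 seconds of long good trials, 10 seconds of
effective bad duration and a min duration sum of 60 seconds, 20
seconds are missing and the two ratios differ (1/6 and 1/2). Once 50
seconds of long good trials are measured, nothing is missing and both
ratios are equal.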
2.2.5.6. Target bounds
With respect to a target, some loads may be classified as upper or
lower bound, and some of the bounds are treated as relevant.
The subsequent parts of MLRsearch rely only on relevant bounds,
without the need to classify other loads.
2.2.5.6.1. Upper bound
A load is classified as an upper bound for a target, if and only if
both optimistic exceed ratio and pessimistic exceed ratio are
larger than the target exceed ratio.
During the search, it is possible there is no upper bound, for
example because every measured load still has too high missing
duration sum.
If the target exceed ratio is zero, and the load has at least one bad
trial (short or long), the load becomes an upper bound.
2.2.5.6.2. Lower bound
A load is classified as a lower bound for a target, if and only if
both optimistic exceed ratio and pessimistic exceed ratio are no
larger than the target exceed ratio.
During the search, it is possible there is no lower bound, for
example because every measured load still has too high missing
duration sum.
If the target exceed ratio is zero, all trials at the load of a lower
bound must be good trials (short or long).
Note that so far it is possible for a lower bound to be higher than
an upper bound.
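The two classifications can be sketched as a single Python function.
The function name and the returned labels are hypothetical.

```python
def classify_load(optimistic, pessimistic, target_exceed_ratio):
    """Return "upper", "lower", or None if the load is still undecided."""
    if optimistic > target_exceed_ratio and pessimistic > target_exceed_ratio:
        return "upper"
    if optimistic <= target_exceed_ratio and pessimistic <= target_exceed_ratio:
        return "lower"
    # The two estimates disagree, so more trials are needed at this load.
    return None
```

A load stays unclassified exactly when the target exceed ratio lies
between the optimistic and the pessimistic exceed ratio.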
2.2.5.6.3. Relevant upper bound
For a target, a load is the relevant upper bound, if and only if it
is an upper bound, and all other upper bounds are larger (as loads).
In some cases, the max load when classified as a lower bound is also
effectively treated as the relevant upper bound. (In that case both
relevant bounds are equal.)
If that happens for a final target at the end of the search, the
controller output may contain max load as the relevant upper bound
(even if the goal exceed ratio was not exceeded), signalling SUT
performs well even at max load.
If the target exceed ratio is zero, the relevant upper bound is the
smallest load where a bad trial (short or long) has been measured.
2.2.5.6.4. Relevant lower bound
For a target, a load is the relevant lower bound if two conditions
hold. Both optimistic exceed ratio and pessimistic exceed ratio
are no larger than the target exceed ratio, and there is no smaller
load classified as an upper bound.
This is a second place where MLRsearch is not symmetric (the first
place was effective bad duration sum).
While it is not likely for MLRsearch to find a smaller upper bound
and a larger load satisfying the first condition for the lower bound,
it still may happen and MLRsearch has to deal with it. The second
condition makes sure the relevant lower bound is smaller than the
relevant upper bound.
In some cases, the min load when classified as an upper bound is also
effectively treated as the relevant lower bound. (In that case both
relevant bounds are equal.)
If that happens for a final target at the end of the search, the
controller output may contain min load as the relevant lower bound
even if the exceed ratio was 'overstepped', signalling the SUT does
not even reach the minimal required performance.
The manager has to make sure this is distinguished in the report from
cases where min rate is a legitimate conditional throughput (e.g. the
exceed ratio was not overstepped at the min load).
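Selecting the relevant bounds from already classified loads can be
sketched in Python as below. The names are hypothetical. Choosing the
largest load among those satisfying both lower bound conditions is an
assumption of this sketch, and the special treatment of min load and
max load as effective bounds is intentionally left out.

```python
def relevant_bounds(classified):
    """Map of load -> "upper"/"lower"/None gives (lower, upper) loads."""
    uppers = [load for load, kind in classified.items() if kind == "upper"]
    # The relevant upper bound is the smallest upper bound.
    upper = min(uppers) if uppers else None
    # Lower bounds above the relevant upper bound are not eligible,
    # which is the asymmetry described in the text.
    lowers = [
        load for load, kind in classified.items()
        if kind == "lower" and (upper is None or load < upper)
    ]
    lower = max(lowers) if lowers else None
    return lower, upper
```

In the example {10: "lower", 20: "lower", 30: "upper", 40: "lower"},
the load 40 is ignored because a smaller upper bound (30) exists, so
the relevant bounds are 20 and 30.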
2.2.5.6.5. Relevant bounds
The pair of the relevant lower bound and the relevant upper bound.
Useful for determining the width of the relevant bounds. Any of the
bounds may be the effective one (max load or min load).
A goal is achieved (at the end of the search) when the final target's
relevant bounds have width no larger than the goal width.
2.2.5.7. Candidate selector
A stateful object (a finite state machine) focusing on a single
target, used to determine next trial input.
Initialized for a pair of targets: the current target and its
preceding target (if any).
Private state (not shared with other selectors) consists of mode and
flags. Public state (shared with all selectors) is the actual
relevant bounds for both targets (current and preceding).
After accepting a trial result, each selector can nominate one
candidate (or no candidate) for the next trial measurement.
2.2.5.7.1. Current target
This is the target this selector tries to achieve.
2.2.5.7.2. Preceding target
The target (if any) preceding to the current target.
While this selector does not focus on the preceding target, the
relevant bounds for the preceding target are used as hints when the
current target does not have enough of its own relevant bounds.
2.2.5.7.3. Candidate
The trial input (if any) this selector nominates.
The trial duration attribute is always the current target trial
duration. The trial load attribute depends on the selector state.
Candidates have defined ordering, to simplify finding the winner. If
load differs, the candidate with lower load is preferred. If load is
the same but duration differs, the candidate with larger duration is
preferred.
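The ordering can be expressed as a sort key, as in this Python
sketch. Representing a candidate as a (load, duration) tuple, and a
missing nomination as None, are assumptions made for illustration.

```python
def pick_preferred(candidates):
    """Return the preferred (load, duration) candidate, or None."""
    nominated = [cand for cand in candidates if cand is not None]
    if not nominated:
        return None
    # Lower load first; among equal loads, larger duration first.
    return min(nominated, key=lambda cand: (cand[0], -cand[1]))
```

For example, among (200.0, 1.0), (100.0, 1.0) and (100.0, 60.0), the
preferred candidate is (100.0, 60.0).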
2.2.5.7.4. Selector mode
During its lifetime, a selector proceeds through the following modes
in order, although some modes may be skipped or revisited.
Each mode has its own strategy of determining the candidate load (if
any).
2.2.5.7.4.1. Waiting
Not enough relevant bounds (even for the preceding target). In this
mode, the selector abstains from nominating a candidate.
This selector leaves this mode when preceding target's selector is
done.
2.2.5.7.4.2. Halving
Candidate is in the middle of the relevant bounds of the preceding
target.
If the relevant bounds are narrow enough already, this mode is
skipped. As the preceding target had double width, just one halving
load needs to be measured.
Selector uses a flag to avoid re-entering this mode once it finished
measuring the halved load.
2.2.5.7.4.3. Upgrading
This mode activates when one relevant bound for the current target is
present and there is a matching relevant bound of the preceding
target within the current target width. Candidate is the load of the
matching bound from the preceding target.
At most one bound load is measured, depending on halving outcome.
Private flags are used to avoid upgrading at later times once
selector finished measuring the upgraded load.
2.2.5.7.4.4. Extending
Refined already but the other relevant bound for the current target
is still missing. Nominate new candidate according to external
search. Initial target selectors skip all previous modes.
A private value is used to track the width to be used in next load
extension (increasing geometrically). For initial target selectors,
the starting width may be chosen based on pre-initial trial results.
If both relevant bounds are present at the current load, but the
lower bound is far away (compared to tracked width), the candidate
from this mode is preferred (as long as the load is larger than the
candidate load of bisecting mode).
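One step of the external search can be sketched in Python as below.
This is a simplified illustration: the names are hypothetical, widths
are treated as absolute load differences, and the growth factor of two
is an assumption (the text only requires the width to increase
geometrically).

```python
def extend(bound_load, width, min_load, max_load, upwards):
    """Return (candidate_load, next_width) for one extension step."""
    step = width if upwards else -width
    # Keep the candidate within the configured load limits.
    candidate = min(max_load, max(min_load, bound_load + step))
    # The tracked width grows geometrically for the next extension.
    return candidate, width * 2.0
```

Starting from an upper bound at 100.0 with width 10.0 and searching
downwards, the first candidate is 90.0. If that load also becomes an
upper bound, the next candidate is 70.0, and so on until a lower
bound is found or min load is reached.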
2.2.5.7.4.5. Bisecting
Both relevant bounds for the current target are available, but they
are too far from each other. Candidate is in the middle.
Contrary to halving, the candidate load does not need to be at the
exact middle. For example if the width of the current relevant
bounds is three times as large as the target width, it is
advantageous to split the interval in 1:2 ratio (choosing the lower
candidate load), as it can save one bisect.
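One possible splitting rule that reproduces the 1:2 example is
sketched below in Python. It is not taken from the reference
implementation: the names are hypothetical, widths are treated as
absolute load differences, and the rule itself (give the upper part
the largest power-of-two multiple of the target width that is smaller
than the whole interval) is an assumption of this sketch.

```python
import math


def bisect_candidate(lower, upper, target_width):
    """Return a candidate load between the bounds, or None if done."""
    parts = math.ceil((upper - lower) / target_width)
    if parts <= 1:
        return None  # already narrow enough, nothing to bisect
    # Find the largest power of two strictly smaller than parts.
    power = 1
    while power * 2 < parts:
        power *= 2
    # The lower part gets the remainder, measured in target widths.
    return lower + (parts - power) * target_width
```

For bounds 100.0 and 130.0 with target width 10.0, the candidate is
110.0. If that load becomes an upper bound, the target width is
reached after a single trial load; otherwise one more bisection of
the remaining interval of width 20.0 is enough.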
2.2.5.7.4.6. Done
Both relevant bounds for the current target are available, the width
is no larger than the target width. No candidate.
If a selector reaches the done state, it is still possible later
trials invalidate its relevant lower bound (by proving a lower load
is in fact a new upper bound), making the selector transition into
extending or bisecting mode.
2.2.5.7.5. Active selector
Derived from a common goal, the earliest selector which nominates a
candidate is considered to be the active selector for this goal.
Candidates from other selectors of the same goal are ignored.
It is quite possible selectors focusing on other goals have already
found a lower bound relevant to multiple targets in a chain. In that
case, we want the most-initial of the target selectors (not already
in done mode) to have the nomination.
Otherwise (when in extending mode and missing relevant upper bound)
the closer-to-final selectors would nominate candidates at lower load
but at too high duration sum, preventing some of the time savings.
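Finding the active selector for one goal can be sketched in Python
as follows. The function name is hypothetical, and the nominations
are assumed to be given as a list ordered from the initial target
selector to the final target selector, with None meaning no
candidate.

```python
def active_candidate(chain_nominations):
    """Return (selector_index, candidate) of the active selector."""
    for index, candidate in enumerate(chain_nominations):
        if candidate is not None:
            # The most-initial nominating selector becomes active;
            # nominations of later selectors of this goal are ignored.
            return index, candidate
    return None
```

For example, if the initial target selector is done and the next two
selectors nominate, only the nomination of the second selector in the
chain is considered for this goal.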
2.2.5.7.6. Winner
If the candidate previously nominated by a selector was the one that
got measured, the candidate is called a winner.
A selector observing its previous candidate was a winner can use
simplified logic when determining the mode, as it knows no other
selectors may have changed the relevant loads unexpectedly.
2.2.6. Controller output
The output object the controller returns to the manager is a mapping
assigning each search goal its conditional output (if it exists).
The controller MAY include more information (if manager accepts it),
for example load stat at relevant bounds.
There MAY be several ways to communicate the fact a conditional
output does not exist (e.g. min load is classified as an upper
bound). The manager MUST NOT present min load as a conditional
output in that case.
If max load is a lower bound, it leads to a valid conditional output
value.
2.2.6.1. Conditional throughput
The conditional throughput is the average of trial forwarding rates
across long good trials measured at the (offered load classified as)
relevant lower bound (for the goal, at the end of the search). The
average is the weighted arithmetic mean, weighted by trial duration.
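The weighted mean can be sketched in Python as below. The function
name and the representation of a trial as a (duration, loss ratio,
forwarding rate) tuple are assumptions made for illustration. The
trials passed in are those measured at the relevant lower bound.

```python
def conditional_throughput(trials, target_trial_duration,
                           target_loss_ratio):
    """Duration-weighted mean forwarding rate of long good trials."""
    long_good = [
        (duration, rate) for duration, loss_ratio, rate in trials
        if duration >= target_trial_duration
        and loss_ratio <= target_loss_ratio
    ]
    total = sum(duration for duration, _ in long_good)
    if total <= 0.0:
        return None  # no long good trial, nothing to average
    return sum(duration * rate for duration, rate in long_good) / total
```

Short trials and bad trials at that load do not enter the average,
even though they influenced the classification of the load.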
If the goal exceed ratio is zero, the definition of the relevant
bounds simplifies significantly. If additionally the goal loss ratio
is zero, and the goal min duration sum is equal to goal final trial
duration, conditional throughput becomes conditionally compliant with
RFC 2544 throughput. If the goal final trial duration is at least 60
seconds, the conditional throughput becomes unconditionally compliant
with RFC 2544 throughput.
3. Problems
3.1. Long Test Duration
Emergence of software DUTs, with frequent software updates and a
number of different packet processing modes and configurations,
drives the requirement of continuous test execution and bringing down
the test execution time.
In the context of characterising a particular DUT's network
performance, this calls for improving the time efficiency of
throughput search. A vanilla bisection (at 60sec trial duration for
unconditional [RFC2544] compliance) is slow, because most trials
spend time quite far from the eventual throughput.
[RFC2544] does not specify any stopping condition for throughput
search, so users can trade off between search duration and achieved
precision. But, due to the exponential behavior of bisection, a small
improvement in search duration requires a relatively large sacrifice
in the throughput precision.
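The trade-off can be illustrated with a short Python calculation.
The function name and the numbers are hypothetical, and the sketch
assumes a plain bisection that halves the load interval with every
trial.

```python
import math


def bisection_cost(min_load, max_load, precision, trial_duration=60.0):
    """Return (trial_count, total_seconds) for a plain bisection."""
    # Each trial halves the interval, so the count is logarithmic.
    trials = math.ceil(math.log2((max_load - min_load) / precision))
    return trials, trials * trial_duration
```

For a load interval of 1024 units, reaching a precision of 1 unit
takes 10 trials (600 seconds), while accepting a precision twice as
coarse saves just one trial (540 seconds). In this example, a 10%
shorter search costs a 100% larger final interval.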
3.2. DUT within SUT
[RFC2285] defines:
* _DUT_ as: The network forwarding device to which stimulus is
offered and response measured ([RFC2285], section 3.1.1).
* _SUT_ as: The collective set of network devices to which stimulus
is offered as a single entity and response measured ([RFC2285],
section 3.1.2).
[RFC2544] specifies a test setup with an external tester stimulating
the networking system, treating it either as a single DUT, or as a
system of devices, an SUT.
In case of software networking, the SUT consists of a software
program processing packets (device of interest, the DUT), running on
a server hardware and using operating system functions as
appropriate, with server hardware resources shared across all
programs and the operating system.
DUT is effectively "nested" within SUT.
Due to a shared multi-tenant nature of SUT, DUT is subject to
interference (noise) coming from the operating system and any other
software running on the same server. Some sources of noise can be
eliminated (e.g. by pinning DUT program threads to specific CPU cores
and isolating those cores to avoid context switching). But some
noise remains after all such reasonable precautions are applied.
This noise does negatively affect DUT's network performance. We
refer to it as an _SUT noise_.
DUT can also exhibit fluctuating performance itself, e.g. while
performing some "stop the world" internal stateful processing. In
many cases this may be an expected per-design behavior, as it would
be observable even in a hypothetical scenario where all sources of
SUT noise are eliminated. Such behavior affects trial results in a
way similar to SUT noise. We use _noise_ as a shorthand covering
both _DUT fluctuations_ and genuine SUT noise.
A simple model of SUT performance consists of a baseline _noiseless
performance_, and an additional noise. The baseline is assumed to be
constant (enough). The noise varies in time, sometimes wildly. The
noise can sometimes be negligible, but frequently it lowers the
observed SUT performance in a trial.
In this model, SUT does not have a single performance value, it has a
spectrum. One end of the spectrum is the noiseless baseline, the
other end is a _noiseful performance_. In practice, trial results
close to the noiseful end of the spectrum happen only rarely. The
worse the performance, the more rarely it is seen in a trial.
Focusing on DUT, the benchmarking effort should aim at eliminating
only the SUT noise from SUT measurement. But that is not really
possible, as there are no realistic enough models able to distinguish
SUT noise from DUT fluctuations.
However, assuming that a well-constructed SUT has the DUT as its
performance bottleneck, the "DUT noiseless performance" can be
defined as the noiseless end of SUT performance spectrum. (At least
for throughput. For other quantities such as latency there will be
an additive difference.) By this definition, DUT noiseless
performance also minimizes the impact of DUT fluctuations.
In this document, we reduce the "DUT within SUT" problem to
estimating the noiseless end of SUT performance spectrum from a
limited number of trial results.
Any improvements to the throughput search algorithm, aimed at better
handling of software networking SUT and DUT setups, should employ
strategies recognizing the presence of SUT noise, and allow discovery
of (proxies for) DUT noiseless performance at different levels of
sensitivity to SUT noise.
3.3. Repeatability and Comparability
[RFC2544] does not suggest repeating the throughput search. And from
just one throughput value, it cannot be determined how repeatable
that value is. In practice, poor repeatability is also the main
cause of poor comparability, e.g. different benchmarking teams can
test the same SUT but get different throughput values.
[RFC2544] throughput requirements (60s trial, no tolerance to single
frame loss) force the search to fluctuate close to the noiseful end of
SUT performance spectrum. As that end is affected by rare trials of
significantly low performance, the resulting throughput repeatability
is poor.
The repeatability problem is the problem of defining a search
procedure which reports more stable results (even if they can no
longer be called "throughput" in [RFC2544] sense). According to
baseline (noiseless) and noiseful model, better repeatability will be
at the noiseless end of the spectrum. Therefore, solutions to the
"DUT within SUT" problem will help also with the repeatability
problem.
Conversely, any alteration to [RFC2544] throughput search that
improves repeatability should be considered as less dependent on the
SUT noise.
An alternative option is to simply run a search multiple times, and
report some statistics (e.g. average and standard deviation). This
can be used for "important" tests, but it makes the search duration
problem even more pronounced.
3.4. Throughput with Non-Zero Loss
[RFC1242] (section 3.17) defines throughput as:
   The maximum rate at which none of the offered frames are dropped
   by the device.
and then it says:
   Since even the loss of one frame in a data stream can cause
   significant delays while waiting for the higher level protocols
   to time out, it is useful to know the actual maximum data rate
   that the device can support.
Contrary to that, many benchmarking teams settle for a non-zero
(small) loss ratio as the goal for a "throughput rate".
Motivations are many: modern protocols tolerate frame loss better;
trials nowadays send way more frames within the same duration; impact
of rare noise bursts is smaller as the baseline performance can
compensate somewhat by keeping the loss ratio below the goal; if SUT
noise with "ideal DUT" is known, it can be set as the loss ratio
goal.
Regardless of validity of any and all similar motivations, support
for non-zero loss goals makes any search algorithm more user-
friendly. [RFC2544] throughput is not friendly in this regard.
Searching for multiple goal loss ratios also helps to describe the
SUT performance better than a single goal result. Repeated wide gap
between zero and non-zero loss conditional throughputs indicates the
noise has a large impact on the overall SUT performance.
It is easy to modify the vanilla bisection to find a lower bound for
intended load that satisfies a non-zero-loss goal, but it is not that
obvious how to search for multiple goals at once, hence the support
for multiple loss goals remains a problem.
3.5. Inconsistent Trial Results
While performing throughput search by executing a sequence of
measurement trials, there is a risk of encountering inconsistencies
between trial results.
The plain bisection never encounters inconsistent trials. But
[RFC2544] hints at the possibility of inconsistent trial results in
two places. The first place is section 24 where full trial durations
are required, presumably because they can be inconsistent with
results from shorter trial durations. The second place is section
26.3 where two successive zero-loss trials are recommended,
presumably because after one zero-loss trial there can be subsequent
inconsistent non-zero-loss trial.
Examples include:
* a trial at the same load (same or different trial duration)
results in a different packet loss ratio.
* a trial at higher load (same or different trial duration) results
in a smaller packet loss ratio.
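Both kinds of inconsistency listed above can be detected
mechanically, as in this Python sketch. The function name, the
message strings and the representation of a trial as a (load,
duration, loss ratio) tuple are assumptions made for illustration.

```python
def find_inconsistencies(trials):
    """Return (index_a, index_b, reason) for inconsistent trial pairs."""
    found = []
    for i, (load_a, _, loss_a) in enumerate(trials):
        for j in range(i + 1, len(trials)):
            load_b, _, loss_b = trials[j]
            if load_a == load_b and loss_a != loss_b:
                found.append((i, j, "same load, different loss ratio"))
            elif (load_a - load_b) * (loss_a - loss_b) < 0:
                # The trial at the higher load saw the smaller loss ratio.
                found.append((i, j, "higher load, smaller loss ratio"))
    return found
```

The sketch only reports the inconsistencies. Deciding how the search
continues after finding one is the actual problem described in this
section.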
Any robust throughput search algorithm needs to decide how to
continue the search in presence of such inconsistencies. Definitions
of throughput in [RFC1242] and [RFC2544] are not specific enough to
imply a unique way of handling such inconsistencies.
Ideally, there will be a definition of a quantity which both
generalizes throughput for non-zero-loss (and other possible
repeatability enhancements) and is precise enough to force a
specific way to resolve trial inconsistencies. But until such a
definition is agreed upon, the correct way to handle inconsistent
trial results remains an open problem.
4. How the problems are addressed
The configurable loss ratio in MLRsearch search goals directly
supports non-zero-loss conditional throughput. In practice, the
stability of conditional throughput results increases with higher
loss ratio goals.
Multiple trials with noise tolerance enhancement, as implemented in
MLRsearch using a non-zero goal exceed ratio value, also indirectly
increase the result stability. That allows MLRsearch to achieve all
the benefits of Binary Search with Loss Verification, as recommended
in [RFC9004] (section 6.2) and specified in [TST009] (section
12.3.3).
The main factor improving the overall search time is the introduction
of preceding targets. Less impactful time savings are achieved by
pre-initial trials, halving mode and smart splitting in bisecting
mode.
In several places, MLRsearch is "conservative" when handling
(potentially) inconsistent results. This includes the requirement
for the relevant lower bound to be smaller than any upper bound, the
unequal handling of good and bad short trials, and preference to
lower load when choosing the winner among candidates.
While this does not guarantee good search stability (goals focusing on
higher loads may still invalidate existing bounds simply by requiring
larger min duration sums), it lowers the chance of SUT having an area
of poorer performance below the reported conditional throughput loads.
In any case, the definition of conditional throughput is precise
enough to dictate "conservative" handling of trial inconsistencies.
5. IANA Considerations
No requests of IANA.
6. Security Considerations
Benchmarking activities as described in this memo are limited to
technology characterization of a DUT/SUT using controlled stimuli in
a laboratory environment, with dedicated address space and the
constraints specified in the sections above.
The benchmarking network topology will be an independent test setup
and MUST NOT be connected to devices that may forward the test
traffic into a production network or misroute traffic to the test
management network.
Further, benchmarking is performed on a "black-box" basis, relying
solely on measurements observable external to the DUT/SUT.
Special capabilities SHOULD NOT exist in the DUT/SUT specifically for
benchmarking purposes. Any implications for network security arising
from the DUT/SUT SHOULD be identical in the lab and in production
networks.
7. Acknowledgements
Many thanks to Alec Hothan of OPNFV NFVbench project for thorough
review and numerous useful comments and suggestions.
8. References
8.1. Normative References
[RFC1242] Bradner, S., "Benchmarking Terminology for Network
Interconnection Devices", RFC 1242, DOI 10.17487/RFC1242,
July 1991, <https://www.rfc-editor.org/info/rfc1242>.
[RFC2285] Mandeville, R., "Benchmarking Terminology for LAN
Switching Devices", RFC 2285, DOI 10.17487/RFC2285,
February 1998, <https://www.rfc-editor.org/info/rfc2285>.
[RFC2544] Bradner, S. and J. McQuaid, "Benchmarking Methodology for
Network Interconnect Devices", RFC 2544,
DOI 10.17487/RFC2544, March 1999,
<https://www.rfc-editor.org/info/rfc2544>.
[RFC9004] Morton, A., "Updates for the Back-to-Back Frame Benchmark
in RFC 2544", RFC 9004, DOI 10.17487/RFC9004, May 2021,
<https://www.rfc-editor.org/info/rfc9004>.
8.2. Informative References
[FDio-CSIT-MLRsearch]
"FD.io CSIT Test Methodology - MLRsearch", November 2022,
<https://csit.fd.io/cdocs/methodology/measurements/
data_plane_throughput/mlr_search/>.
[PyPI-MLRsearch]
"MLRsearch 0.4.0, Python Package Index", April 2021,
<https://pypi.org/project/MLRsearch/0.4.0/>.
[TST009] "TST 009", n.d., <https://www.etsi.org/deliver/etsi_gs/
NFV-TST/001_099/009/03.04.01_60/gs_NFV-
TST009v030401p.pdf>.
Authors' Addresses
Maciek Konstantynowicz
Cisco Systems
Email: mkonstan@cisco.com
Vratko Polak
Cisco Systems
Email: vrpolak@cisco.com