Skip to content

[RNTuple] Implement I/O Performance Metrics: Sparseness, Randomness, and Transactions#20994

Open
JasMehta08 wants to merge 1 commit intoroot-project:masterfrom
JasMehta08:performance-metrics-RNTuple
Open

[RNTuple] Implement I/O Performance Metrics: Sparseness, Randomness, and Transactions#20994
JasMehta08 wants to merge 1 commit intoroot-project:masterfrom
JasMehta08:performance-metrics-RNTuple

Conversation

@JasMehta08
Copy link
Contributor

This Pull request:

improves the existing Performance Metrics in RNTuple.

Changes or fixes:

This PR addresses the issue that had been raised for the improvements of metrics. It implements,

  • GetSparseness(): Calculates the ratio of explicit payload bytes read to the total file size.
  • GetRandomness(): Calculates the ratio of seek distance to bytes read (identifies inefficient access patterns).
  • GetTransactions(): Tracks the total number of physical I/O operations (now using a dedicated fTransactions atomic counter).

Implementation Details:

These have been implemented through the introduction of transient, thread-safe members in RNTupleMetrics,

  • std::atomicstd::uint64_t fSumSkip: To track total seek distance.

  • std::atomicstd::uint64_t fExplicitBytesRead: To track payload bytes.

  • std::atomicstd::uint64_t fTransactions: To track I/O operations.

I have changed RPageStorageFile to update these counters during ReadBuffer, ReadV, and LoadClusters operations. The use of std::atomic ensures thread safety with negligible performance overhead on the I/O path.

Reset Functionality & Accuracy:
I added a Reset() method to RNTupleMetrics. This is needed for obtaining accurate Randomness metrics for the analysis loop.

  • Without Reset(): The metric is dominated by the initial file seek (Header → Footer → Header), resulting in a Randomness score > 2.0.

  • With Reset(): Users can clear the counters after initialization, isolating the steady-state performance of their event loop.

Verification:

Added a new unit test TEST(Metrics, IOMetrics) in tree/ntuple/test/ntuple_metrics.cxx to verify the logic and the Reset() behavior.

Also tested locally using this code

Click to view Manual Verification Code & Output
#include <ROOT/RNTupleModel.hxx>
#include <ROOT/RNTupleWriter.hxx>
#include <ROOT/RNTupleReader.hxx>
#include <ROOT/RNTupleMetrics.hxx>
#include <iostream>

int main()
{
   using namespace ROOT::Experimental;
   using namespace ROOT;


   std::string ntupleName = "ntuple";
   std::string fileName = "verify_metrics.root";

   // Create a dummy RNTuple
   std::cout << "[INFO] Creating dummy RNTuple..." << std::endl;
   {
      auto model = RNTupleModel::Create();
      auto field = model->MakeField<int>("val");
      auto writer = RNTupleWriter::Recreate(std::move(model), ntupleName, fileName);
      for (int i = 0; i < 1000; ++i) {
         *field = i;
         writer->Fill();
      }
   }

   // Read it back
   std::cout << "[INFO] Reading RNTuple and enabling metrics..." << std::endl;
   auto reader = RNTupleReader::Open(ntupleName, fileName);
   reader->EnableMetrics();

   // Read everything to force I/O
   for (auto entry : *reader) {
      (void)entry;
   }

   // Print Metrics
   auto& metrics = reader->GetMetrics();
   
   std::cout << "\n===== RNTuple Performance Metrics Report =====" << std::endl;
   std::cout << "Transactions: " << metrics.GetTransactions() << " (Expected > 0)" << std::endl;
   std::cout << "Sparseness:   " << metrics.GetSparseness() << " (Expected > 0.0)" << std::endl;
   std::cout << "Randomness:   " << metrics.GetRandomness() << " (Expected > 0.0)" << std::endl;
   std::cout << "==============================================\n" << std::endl;

   // Verification Logic
   bool passed = true;
   if (metrics.GetTransactions() <= 0) {
       std::cerr << "[FAIL] Transactions should be > 0" << std::endl;
       passed = false;
   }
   if (metrics.GetSparseness() <= 0.0 || metrics.GetSparseness() > 1.0) {
       std::cerr << "[FAIL] Sparseness should be in (0.0, 1.0]" << std::endl;
       passed = false;
   }
   if (metrics.GetRandomness() <= 0.0) {
       std::cerr << "[FAIL] Randomness should be > 0.0" << std::endl;
       passed = false;
   }

   // Verify Reset functionality
   std::cout << "[INFO] Testing Reset()..." << std::endl;
   reader->GetMetrics().Reset();
   
   // We expect Randomness to go from ~2.13 down to 0.0
   if (metrics.GetSparseness() == 0.0 && metrics.GetRandomness() == 0.0) {
       std::cout << "[SUCCESS] Reset() worked! Randomness dropped to 0.0" << std::endl;
   } else {
       std::cerr << "[FAIL] Reset() failed! Counters are not zero." << std::endl;
       passed = false;
   }
   
   if (passed) {
       std::cout << "[SUCCESS] All metrics logic verified!" << std::endl;
       return 0;
   } else {
       std::cout << "[FAILURE] Some metrics checks failed." << std::endl;
       return 1;
   }
}
[INFO] Reading RNTuple and enabling metrics...

===== RNTuple Performance Metrics Report =====
Transactions: 2 (Expected > 0)
Sparseness:   0.181648 (Expected > 0.0)
Randomness:   2.13746 (Expected > 0.0)
==============================================

[INFO] Testing Reset()...
[SUCCESS] Reset() worked! Randomness dropped to 0.0
[SUCCESS] All metrics logic verified!
#include <ROOT/RNTupleModel.hxx>
#include <ROOT/RNTupleWriter.hxx>
#include <ROOT/RNTupleReader.hxx>
#include <ROOT/RNTupleMetrics.hxx>
#include <iostream>
#include <cstdlib>

int main()
{
   using namespace ROOT::Experimental;
   using namespace ROOT;

   std::string ntupleName = "ntuple";
   std::string fileName = "verify_metrics_v2.root";

   // Create Data (50MB analysis dataset)
   {
      auto model = RNTupleModel::Create();
      auto fieldPt = model->MakeField<float>("pt");
      auto fieldEta = model->MakeField<float>("eta");
      auto fieldPhi = model->MakeField<float>("phi");

      // Realistic cluster size for analysis (1MB)
      RNTupleWriteOptions options;
      options.SetApproxZippedClusterSize(1024 * 1024); 
      
      auto writer = RNTupleWriter::Recreate(std::move(model), ntupleName, fileName, options);
      
      std::cout << "[INFO] Creating 50MB analysis dataset..." << std::endl;
      // ~4.2M events × 12 bytes/event = ~50MB
      for (int i = 0; i < 4200000; ++i) {
         *fieldPt = 20.0f + static_cast<float>(std::rand() % 100);
         *fieldEta = -2.5f + static_cast<float>(std::rand() % 500) / 100.0f;
         *fieldPhi = -3.14f + static_cast<float>(std::rand() % 628) / 100.0f;
         writer->Fill();
         
         if (i % 500000 == 0) {
            std::cout << "  Written " << (i / 1000) << "k events..." << std::endl;
         }
      }
      std::cout << "[INFO] Dataset created successfully." << std::endl;
   }

   // Open and IMMEDIATE RESET
   std::cout << "[INFO] Opening RNTuple..." << std::endl;
   auto reader = RNTupleReader::Open(ntupleName, fileName);
   reader->EnableMetrics();
   
   // FORCE the first seek (Tail -> Head) so we can clear it
   try {
        // LoadEntry is void but might throw if out of range, 0 is safe here
        reader->LoadEntry(0); 
   } catch (...) {}

   auto& m_pre = reader->GetMetrics();
   std::cout << "\n[DEBUG] Pre-Reset Metrics:" << std::endl;
   std::cout << "  Transactions: " << m_pre.GetTransactions() << std::endl;
   std::cout << "  Randomness:   " << m_pre.GetRandomness() << std::endl;
   std::cout << "  Total File Size: " << m_pre.GetTotalFileSize() << " bytes" << std::endl;
   std::cout << "  N Entries: " << reader->GetNEntries() << std::endl;

   std::cout << "[INFO] Resetting Metrics to isolate analysis phase..." << std::endl;
   reader->GetMetrics().Reset();

   // Analysis Loop (actually access the data)
   std::cout << "[INFO] Running analysis on " << reader->GetNEntries() << " events..." << std::endl;
   
   auto viewPt = reader->GetView<float>("pt");
   auto viewEta = reader->GetView<float>("eta");
   auto viewPhi = reader->GetView<float>("phi");
   
   long selectedEvents = 0;
   double sumPt = 0.0;
   
   for (auto i : reader->GetEntryRange()) {
      float pt = viewPt(i);
      float eta = viewEta(i);
      float phi = viewPhi(i);
      
      if (pt > 25.0 && std::abs(eta) < 2.4) {
         selectedEvents++;
         sumPt += pt;
      }
      
      if (i % 500000 == 0 && i > 0) {
         std::cout << "  Processed " << (i / 1000) << "k events..." << std::endl;
      }
   }
   
   std::cout << "[INFO] Analysis complete. Selected " << selectedEvents 
             << " events, <pT> = " << (sumPt / selectedEvents) << " GeV" << std::endl;

   auto& metrics = reader->GetMetrics();

   // Report Analysis Metrics
   std::cout << "\n===== Analysis Performance Report =====" << std::endl;
   std::cout << "Transactions: " << metrics.GetTransactions() << " (Expected > 0)" << std::endl;
   std::cout << "Sparseness:   " << metrics.GetSparseness() << std::endl;
   std::cout << "Randomness:   " << metrics.GetRandomness() << " (Expected low for sequential)" << std::endl;
   std::cout << "======================================\n" << std::endl;

   if (metrics.GetTransactions() > 0 && metrics.GetRandomness() < 1.0) {
       std::cout << "[SUCCESS] Analysis metrics look good!" << std::endl;
       std::cout << "          Sequential reading achieved (low randomness)." << std::endl;
       return 0;
   } else {
       std::cerr << "[FAIL] Unexpected metrics pattern." << std::endl;
       std::cerr << "       Transactions: " << metrics.GetTransactions() << std::endl;
       std::cerr << "       Randomness: " << metrics.GetRandomness() << std::endl;
       return 1;
   }
}
[INFO] Creating 50MB analysis dataset...
  Written 0k events...
  Written 500k events...
  Written 1000k events...
  Written 1500k events...
  Written 2000k events...
  Written 2500k events...
  Written 3000k events...
  Written 3500k events...
  Written 4000k events...
[INFO] Dataset created successfully.
[INFO] Opening RNTuple...

[DEBUG] Pre-Reset Metrics:
  Transactions: 4
  Randomness:   24.7884
  Total File Size: 0 bytes
  N Entries: 4200000
[INFO] Resetting Metrics to isolate analysis phase...
[INFO] Running analysis on 4200000 events...
  Processed 500k events...
  Processed 1000k events...
  Processed 1500k events...
  Processed 2000k events...
  Processed 2500k events...
  Processed 3000k events...
  Processed 3500k events...
  Processed 4000k events...
[INFO] Analysis complete. Selected 3782202 events, <pT> = 72.4999 GeV

===== Analysis Performance Report =====
Transactions: 24 (Expected > 0)
Sparseness:   0.919237
Randomness:   4.08388e-05 (Expected low for sequential)
======================================

[SUCCESS] Analysis metrics look good!
          Sequential reading achieved (low randomness).
## Checklist:
  • tested changes locally
  • updated the docs (if necessary)

This PR fixes #20853

Copy link
Contributor

@jblomer jblomer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

Before going into a detailed review, I see a few larger items that need to be addressed.

  • The metrics should not be added directly as atomic integers to the RNTupleMetrics object. We have RNTuplePerfCounter for this.
  • Nice observation on the randomness metric. However, I would prefer doing this internally, e.g. excluding header and footer or having two metrics (with and without).
  • I think it makes more sense to do the accounting in the fReader; should save us to chase all the invocations of ReadBuffer.

@JasMehta08
Copy link
Contributor Author

@jblomer Thanks for the review. I have updated the PR to address your three main points:

Counters: I replaced the std::atomic members with RNTupleAtomicCounter (using the standard RNTuplePerfCounter).

Accounting: I moved all I/O accounting into RMiniFileReader (specifically ReadBuffer and ReadV). RPageStorageFile now just passes the metrics pointer to the reader.

Randomness Logic: As suggested, I handled the exclusion internally rather than via Reset().

I added an internal fSumSkipAnalysis counter.

The reader automatically switches to the "Analysis Phase" at the end of Open().

GetRandomness() calculates the ratio using only the analysis-phase seeks, effectively ignoring header/footer overhead without user intervention.

@JasMehta08
Copy link
Contributor Author

While verifying the changes, I noticed a pre-existing issue with RPageSourceFile. When the source object is moved (e.g. into the Reader), the internal fReader still points to the old fMetrics address.

This results in 0 metrics during verification tests that snapshot the reader across a move. It likely doesn't affect production (where the reader owns the source), but it breaks the "Snapshot & Subtract" verification logic.

I have not added a custom Move Constructor to fix this yet, as it felt like a separate architectural fix outside the scope of this metrics refactor. Let me know if you would prefer I include that fix here.

@JasMehta08
Copy link
Contributor Author

@jblomer

I pushed a polish commit addressing 3 specific things that i noticed while reviewing:

  • Object Lifetimes (CloneImpl): Explicitly documented the raw pointer dependency with a // SAFETY comment. I kept the raw pointer pattern to match the existing RNTuple design (Reader owns Views).

    • Question: Is this "Contract-Based Safety" acceptable here, or would std::weak_ptr be better for strict lifetime enforcement?
  • Concurrency Docs: Added comments clarifying that fLastOffset is thread-local (per Clone), while fMetrics uses shared atomics.

  • UB Safety: seek distance logic to use purely unsigned arithmetic (offset >= fLastOffset ? ) to eliminate theoretical signed integer overflow (even though unlikely).

@JasMehta08 JasMehta08 requested a review from jblomer February 2, 2026 16:44
@jblomer
Copy link
Contributor

jblomer commented Feb 2, 2026

I'm afraid this isn't going quite in the right direction. Take a look at ROOT::Internal::RPageSource::EnableDefaultMetrics, we already have some computed metrics that you can use as a blueprint.

@JasMehta08 JasMehta08 force-pushed the performance-metrics-RNTuple branch from fc79786 to c92fc14 Compare February 3, 2026 16:01
@JasMehta08
Copy link
Contributor Author

hey @jblomer, Thanks for the review, sorry about miss understanding the previous review.

I moved the implementation to follow the EnableDefaultMetrics blueprint.

please let me know if they are in the right directions, or should i look a bit deeper in the library.

On a side note RNTuple seems very interesting to me, I'm interested in the low-level systems engineering side of this and learning about columnar data structures in general, could you let me know if there is anyway to deep dive and understand it even better (i did go through the root reference guide i could only find an introduction page. And i also went through the youtube video on the HSF youtube channel but it did not have a lot of details).

please have a review and let me know if these changes are as desired.

Thanks!

@JasMehta08 JasMehta08 force-pushed the performance-metrics-RNTuple branch from c92fc14 to 5a8845d Compare February 3, 2026 17:28
@jblomer
Copy link
Contributor

jblomer commented Feb 12, 2026

Thank you!

That looks better. One issue here is still that the concrete metrics are added to the RNTupleMetrics class. This is not the right place. The RNTupleMetrics is a generic binder for metrics. It is the user of the RNTupleMetrics class (e.g. RPageSource) that implements the metric-specific logic .

As an introduction documentation, I recommend the write-ups in the doc folder of the RNTuple sources. Start with the architecture.md (the architectural overview).

@JasMehta08
Copy link
Contributor Author

@jblomer,

I removed the concrete methods from RNTupleMetrics class. Please have a look and let me know if I should change anything.

Note: I changed the GetCounter() method so that, Users can access metrics via GetCounter() with either short names or full paths:

  • metrics.GetCounter("randomness")
  • metrics.GetCounter("RNTupleReader.RPageSourceFile.randomness")

as having to write the whole name for randomness would be tedious.

Both work identically. Once confirmed by you that everything has been implemented properly, I will add a tutorial and unit test (assuming that would be desired).

Thanks!

@jblomer
Copy link
Contributor

jblomer commented Feb 26, 2026

Thank you!

I think we should only address metrics by their full name. I'm not convinced by the short name as it may result in name collisions. In any case, I think that particular functionality should go in a separate PR.

@jblomer
Copy link
Contributor

jblomer commented Feb 26, 2026

Also, please rebase/squash your commits for a clean history.

@JasMehta08
Copy link
Contributor Author

JasMehta08 commented Feb 26, 2026

@jblomer
Thank you for the review, yes i agree about the short name ones, i had actually realised that it should be a separate PR, if even implemented, just missed removing it from the PR, sorry about that, so will do that.

Also other than that i will add a test for this as well as please let me know if i would have to change some documentation for it or not.

I hope the other changes are good. I will ask for a review after making the changes suggested and also adding tests.

Thanks!

@JasMehta08 JasMehta08 force-pushed the performance-metrics-RNTuple branch from 9e2507c to ba15fe6 Compare February 26, 2026 23:24
@JasMehta08
Copy link
Contributor Author

@jblomer
I have removed the short name update, i added a simple test to make sure that the metrics work, and also squashed the commits into one.

Please let me know if i should change anything else.

Thanks!

Comment on lines 649 to 652
ROOT::Experimental::Detail::RNTupleAtomicCounter &fSumSkip;
ROOT::Experimental::Detail::RNTupleAtomicCounter &fTotalFileSize;
ROOT::Experimental::Detail::RNTupleCalcPerf &fRandomness;
ROOT::Experimental::Detail::RNTupleCalcPerf &fSparseness;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These counters only make sense for files, so should move to RPageSourceFile

Comment on lines 150 to 151
/// Flag to track if total file size has been set (set lazily on first page load when metrics are enabled)
bool fFileSizeSet = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can use a negative file size as a flag and don't need the additional member.

@@ -555,6 +542,30 @@ void ROOT::Internal::RPageSource::EnableDefaultMetrics(const std::string &prefix
}
}
return {false, -1.};
}),
*fMetrics.MakeCounter<RNTupleAtomicCounter *>("sumSkip", "B", "cumulative seek distance"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
*fMetrics.MakeCounter<RNTupleAtomicCounter *>("sumSkip", "B", "cumulative seek distance"),
*fMetrics.MakeCounter<RNTupleAtomicCounter *>("szSkip", "B", "cumulative seek distance"),

@@ -555,6 +542,30 @@ void ROOT::Internal::RPageSource::EnableDefaultMetrics(const std::string &prefix
}
}
return {false, -1.};
}),
*fMetrics.MakeCounter<RNTupleAtomicCounter *>("sumSkip", "B", "cumulative seek distance"),
*fMetrics.MakeCounter<RNTupleAtomicCounter *>("totalFileSize", "B", "total file size"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
*fMetrics.MakeCounter<RNTupleAtomicCounter *>("totalFileSize", "B", "total file size"),
*fMetrics.MakeCounter<RNTupleAtomicCounter *>("szFile", "B", "total file size"),

@@ -650,6 +646,10 @@ protected:
ROOT::Experimental::Detail::RNTupleCalcPerf &fBandwidthUnzip;
ROOT::Experimental::Detail::RNTupleCalcPerf &fFractionReadOverhead;
ROOT::Experimental::Detail::RNTupleCalcPerf &fCompressionRatio;
ROOT::Experimental::Detail::RNTupleAtomicCounter &fSumSkip;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ROOT::Experimental::Detail::RNTupleAtomicCounter &fSumSkip;
ROOT::Experimental::Detail::RNTupleAtomicCounter &fSzSkip;


TEST(Metrics, IOMetrics)
{
std::string rootFileName{"test_ntuple_io_metrics.root"};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed, you can ask the file guard for the path

@@ -515,16 +505,23 @@ ROOT::Internal::RPageRef ROOT::Internal::RPageSourceFile::LoadPageImpl(ColumnHan
directReadBuffer = MakeUninitArray<unsigned char>(sealedPage.GetBufferSize());
{
RNTupleAtomicTimer timer(fCounters->fTimeWallRead, fCounters->fTimeCpuRead);
fReader.ReadBuffer(directReadBuffer.get(), sealedPage.GetBufferSize(),
pageInfo.GetLocator().GetPosition<std::uint64_t>());
auto offset = pageInfo.GetLocator().GetPosition<std::uint64_t>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
auto offset = pageInfo.GetLocator().GetPosition<std::uint64_t>();
const auto offset = pageInfo.GetLocator().GetPosition<std::uint64_t>();

auto offset = pageInfo.GetLocator().GetPosition<std::uint64_t>();
// Track seek distance
if (fLastOffset != 0) {
std::int64_t diff = static_cast<std::int64_t>(offset) - static_cast<std::int64_t>(fLastOffset);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::int64_t diff = static_cast<std::int64_t>(offset) - static_cast<std::int64_t>(fLastOffset);
const std::int64_t diff = static_cast<std::int64_t>(offset) - static_cast<std::int64_t>(fLastOffset);

Comment on lines 512 to 513
std::uint64_t distance = (diff >= 0) ? diff : -diff;
fCounters->fSumSkip.Add(distance);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::uint64_t distance = (diff >= 0) ? diff : -diff;
fCounters->fSumSkip.Add(distance);
fCounters->fSumSkip.Add(std::abs(diff));

Comment on lines 536 to 543
// Track seek distance based on page location even when using cluster pool
auto offset = pageInfo.GetLocator().GetPosition<std::uint64_t>();
if (fLastOffset != 0) {
std::int64_t diff = static_cast<std::int64_t>(offset) - static_cast<std::int64_t>(fLastOffset);
std::uint64_t distance = (diff >= 0) ? diff : -diff;
fCounters->fSumSkip.Add(distance);
}
fLastOffset = offset + sealedPage.GetBufferSize();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong, we need to track the actual read requests in LoadClusters()

@JasMehta08 JasMehta08 force-pushed the performance-metrics-RNTuple branch 2 times, most recently from a47c1e3 to cf6fb09 Compare February 28, 2026 05:40
@JasMehta08 JasMehta08 force-pushed the performance-metrics-RNTuple branch from cf6fb09 to 2e564f6 Compare February 28, 2026 05:46
@JasMehta08
Copy link
Contributor Author

@jblomer Thank you for the review,

I have made all the changes as suggested,
The only additional change that was needed was for moving the setting of the file size to AttachImpl() we would have to track the size using another variable as the metrics are not enabled by that point,

I have used a member variable to store the file size fFileSize(let me know if i should change the name), and now the metric fSzFile is a computed metric rather than it storing the size it returns fFileSIze.

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[ntuple] add several useful performance metrics

2 participants