Feature/schema endpoint #21836

knakatasf · 2025-07-02T17:36:16Z

Scope & Purpose

Added schema endpoint to show the overview of a database collection.
(Please describe the changes in this PR for reviewers, motivation, rationale - mandatory)

💩 Bugfix
🍕 New feature
🔥 Performance improvement
🔨 Refactoring/simplification

Checklist

Related Information

(Please reference tickets / specification / other PRs etc)

Docs PR: HTTP API: Schema endpoint docs-hugo#736
Enterprise PR:
GitHub issue / Jira ticket:
Design document:

mchacki

Sorry I did not complete all files, I will continue tomorrow.
I already have some feedback in the comments.

In general what I have seen looks already pretty good 👍

mchacki · 2025-07-03T13:27:15Z

CHANGELOG

@@ -1,5 +1,9 @@
 devel
 -----
+* Add new API GET /_api/schema to show graphs, views, collections along with


Suggested change

* Add new API GET /_api/schema to show graphs, views, collections along with

* Add new API GET /_api/schema to show the sampled schema of graphs, views, collections along with

mchacki · 2025-07-03T14:26:59Z

arangod/RestHandler/RestSchemaHandler.cpp

+/// Copyright 2014-2025 ArangoDB GmbH, Cologne, Germany
+/// Copyright 2004-2014 triAGENS GmbH, Cologne, Germany


Suggested change

/// Copyright 2014-2025 ArangoDB GmbH, Cologne, Germany

/// Copyright 2004-2014 triAGENS GmbH, Cologne, Germany

/// Copyright 2025-2025 ArangoDB GmbH, Cologne, Germany

mchacki · 2025-07-03T14:35:13Z

arangod/RestHandler/RestSchemaHandler.cpp

+  }
+
+  auto const& suffix = _request->suffixes();
+  switch (suffix.size()) {


Uh, I think we forgot to talk about user rights to access this.

You can see how it would be done here:
https://github.com/arangodb/arangodb/blob/b14b418811b3f4e292429310a5de41f78325b957/arangod/RestHandler/RestAdminClusterHandler.cpp#L2367C1-L2372C4

My suggestion right now:
RO on "canUseDatabase" and RW on "canUseCollection" for the collection schema.
And RW on "canUseDatabase" for everything else.

We may lift this later.

I added the permission checks as if blocks and will show them on Friday.
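The access rule suggested above can be sketched in isolation. This is a minimal illustration, not the handler's code: `Level`, `Target` and `maySeeSchema` are hypothetical stand-ins and do not correspond to ArangoDB's actual auth types or to the `canUseDatabase` / `canUseCollection` signatures.

```cpp
#include <cassert>

// Hypothetical access levels, ordered so that a higher level includes the
// lower ones. These are stand-ins, not ArangoDB's auth types.
enum class Level { None = 0, ReadOnly = 1, ReadWrite = 2 };

// Hypothetical request kind: the collection schema, or anything else
// (full schema, graph schema, view schema).
enum class Target { Collection, Other };

// Encodes the rule from the review comment:
//   collection schema -> RO on the database and RW on the collection
//   everything else   -> RW on the database
bool maySeeSchema(Target target, Level onDatabase, Level onCollection) {
  if (target == Target::Collection) {
    return onDatabase >= Level::ReadOnly && onCollection >= Level::ReadWrite;
  }
  return onDatabase >= Level::ReadWrite;
}

// Usage: a few combinations and the decision they lead to.
void exampleAccessRule() {
  assert(maySeeSchema(Target::Collection, Level::ReadOnly, Level::ReadWrite));
  assert(!maySeeSchema(Target::Collection, Level::ReadOnly, Level::ReadOnly));
  assert(!maySeeSchema(Target::Other, Level::ReadOnly, Level::ReadWrite));
}
```

Keeping the rule in one small function would make it easy to relax later, as the comment anticipates.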

mchacki · 2025-07-03T14:39:20Z

arangod/RestHandler/RestSchemaHandler.cpp

+  return RestStatus::DONE;
+}
+
+RestStatus RestSchemaHandler::lookupSchema(uint64_t sampleNum,


I would make this also return a Result, and move the complete "write response or write error" into the execute method.

I changed the logic of those helper methods and will show them on Friday.

mchacki · 2025-07-03T14:40:57Z

arangod/RestHandler/RestSchemaHandler.cpp

+  if (graphsRes.fail()) {
+    return RestStatus::DONE;
+  }


This code is correct given the method above.
However, it looks strange. As suggested above, I would not have any function return a RestStatus;
all functions should return a Result instead.
Then execute reacts to Result.fail() once, in one central place.
That makes it clear that the error message is written in every error case.

WDYT?

I agree that one central place should react to every Result.fail(), so it is clear where the error came from. On the other hand, I called generateError() in the helper methods so that the error message is specific.
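Both points in this exchange can be met at once: helpers return a result that carries its own specific message, and only `execute` writes the response. The sketch below assumes a simplified `Result` and `Response`; they are hypothetical stand-ins for `arangodb::Result` and the REST response, and the helper names and error codes are illustrative only.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for arangodb::Result: error number plus message.
struct Result {
  int errorNumber = 0;
  std::string errorMessage;
  bool fail() const { return errorNumber != 0; }
};

// Hypothetical stand-in for the HTTP response the handler would write.
struct Response {
  int httpCode = 0;
  std::string body;
};

// Helpers only report what happened; they never write a response.
// The specific error message travels inside the Result.
Result lookupGraphs(bool ok, std::string& out) {
  if (!ok) return {404, "graph lookup failed"};
  out += "graphs;";
  return {};
}

Result lookupViews(bool ok, std::string& out) {
  if (!ok) return {400, "view lookup failed"};
  out += "views;";
  return {};
}

// The single place that turns a failed Result into an error response.
Response execute(bool graphsOk, bool viewsOk) {
  std::string payload;
  Result res = lookupGraphs(graphsOk, payload);
  if (!res.fail()) {
    res = lookupViews(viewsOk, payload);
  }
  if (res.fail()) {
    return {res.errorNumber, res.errorMessage};
  }
  return {200, payload};
}

// Usage: the helper's message survives up to the central error path.
void exampleCentralErrorHandling() {
  assert(execute(true, true).httpCode == 200);
  assert(execute(false, true).body == "graph lookup failed");
}
```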

mchacki · 2025-07-03T14:47:39Z

arangod/RestHandler/RestSchemaHandler.cpp

+  if (!data.hasKey("links")) {
+    Result res{TRI_ERROR_ARANGO_DATA_SOURCE_NOT_FOUND,
+               std::format("View {} has no links", viewName)};
+    generateError(ResponseCode::NOT_FOUND, res.errorNumber(),
+                  res.errorMessage());


🤔 I think having a view without links is valid.
It is not useful in production, but we may just be looking at a bad point in time (e.g. during creation of the view).
I think I would let this through without an error.
Then, of course, asking for the view should give an empty list of collections.

I agree. I changed the method so that when data doesn't have links yet, it returns an empty JSON array instead of calling generateError().

mchacki · 2025-07-03T14:49:15Z

arangod/RestHandler/RestSchemaHandler.cpp

+  linksBuilder.openArray();
+  for (auto li : velocypack::ObjectIterator(data.get("links"))) {
+    auto colName = li.key.copyString();
+    auto colValue = li.value;


I think it is valid to have a View with "includeAllFields: true". Would this error out here?

I hadn't been aware of includeAllFields, actually. I implemented a method to fetch all the attributes and their analyzers with the limit of sampleNum, but I have some questions on this.

mchacki · 2025-07-03T14:51:43Z

arangod/RestHandler/RestSchemaHandler.cpp

+    velocypack::Builder colBuilder;
+    colBuilder.openObject();
+    colBuilder.add("collectionName", velocypack::Value(colName));
+    colSet.insert(colName);
+    colBuilder.add("fields", fieldsBuilder.slice());
+    colBuilder.close();
+    linksBuilder.add(colBuilder.slice());


This code is correct.

However it would need to copy the information from the fieldsBuilder into the colBuilder, and then yet again into the linksBuilder.

Can you instead write the information directly into the links builder?
I think if you start with linksBuilder.openObject(); at the beginning of the loop you can write into it directly.

Maybe you even want to check the ObjectGuard (this directly does a close() for you when getting out of scope, so you cannot forget it).

I changed it so that it writes directly to one viewBuilder instead of instantiating many Builder objects and copying.
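The scoping idea behind that suggestion can be shown with a toy writer. This is only a sketch: `Writer` and `Scope` are hypothetical and are not the VelocyPack Builder API. What matters is that nested values go straight into one output and that the closing bracket is emitted by a destructor.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy JSON writer; NOT the VelocyPack Builder API.
class Writer {
 public:
  void open(char c) { separator(); _out += c; _needComma = false; }
  void close(char c) { _out += c; _needComma = true; }
  void key(std::string const& k) {
    separator();
    _out += '"' + k + "\":";
    _needComma = false;
  }
  void value(std::string const& v) {
    separator();
    _out += '"' + v + '"';
    _needComma = true;
  }
  std::string const& str() const { return _out; }

 private:
  void separator() { if (_needComma) _out += ','; }
  std::string _out;
  bool _needComma = false;
};

// RAII guard: opens on construction, closes when it goes out of scope,
// so a forgotten close() cannot happen.
class Scope {
 public:
  Scope(Writer& w, char openCh, char closeCh) : _w(w), _close(closeCh) {
    _w.open(openCh);
  }
  ~Scope() { _w.close(_close); }
  Scope(Scope const&) = delete;
  Scope& operator=(Scope const&) = delete;

 private:
  Writer& _w;
  char _close;
};

// Writes every link directly into the one writer; no intermediate
// per-collection objects are built and copied.
std::string writeLinks(std::vector<std::string> const& collections) {
  Writer w;
  {
    Scope links(w, '[', ']');
    for (auto const& name : collections) {
      Scope col(w, '{', '}');
      w.key("collectionName");
      w.value(name);
    }
  }
  return w.str();
}

// Usage: a view without links yields an empty array.
void exampleWriteLinks() {
  assert(writeLinks({}) == "[]");
}
```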

dothebart · 2025-07-03T15:05:14Z

Please remember to add rta-makedata tests, so dump & restore can be tested as well.

mchacki

This PR is already in a very good shape!
I have some mostly minor requests to change.

mchacki · 2025-07-09T09:34:00Z

arangod/GeneralServer/GeneralServerFeature.cpp

@@ -693,6 +694,11 @@ void GeneralServerFeature::defineRemainingHandlers(
      RestHandlerCreator<RestSimpleHandler>::createData<aql::QueryRegistry*>,
      queryRegistry);

+  f.addPrefixHandler(
+      "/_api/schema",


We typically define those as static strings.

Please look at arangod/RestHandler/RestVocbaseBaseHandler.cpp and arangod/RestHandler/RestVocbaseBaseHandler.h
And add the Schema path there.

mchacki · 2025-07-09T09:40:09Z

tests/js/client/shell/api/rest-schema-handler.js

+            );
+            gm._create(
+                GRAPH_MANUFACTURE,
+                [ gm._relation(EDGE_MANUFACTURED, COLLECTION_COMPANIES, COLLECTION_PRODUCTS) ]


Just for the sake of completeness:

Can you make one of these two graphs hold two Relations e.g. add EDGE_CONTAINS, COLLECTION_PRODUCTS, COLLECTION_MATERIALS to the GRAPH_MANUFACTURE.
So we can see the API can handle multiple edge collections, and handle overlapping vertex collections.

I think I have not talked about orphan collections yet.
A graph can have Vertex collections that do not have any connected Edge Collection. They are added in the third parameter which is an Array. Like this:
gm._create("myGraph", [gm._relation("e", "v","v")], ["orphan"]);
It should be visible as Orphans in the GraphSchema.
And more importantly the orphan needs to be part of the Collection list when asked for the Graph Schema.

Refactored the code:
(1) GraphSchema now has the attribute orphans and orphan collections are also shown in the collections list.

(2) Added the case where a graph has 2 relations, from and to has 2 documents, and orphans has 2 documents.

(3) Added assertions for the case of (2).

(4) This line ensures that no duplicated collections are shown.
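A minimal sketch of how a duplicate-free collection list could be derived from several relations plus orphans. The `Relation` struct and `collectionsOf` are hypothetical, and including the edge collections in the list is an assumption, not something this thread confirms.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical shape of one edge definition: an edge collection and the
// vertex collections it connects.
struct Relation {
  std::string edge;
  std::vector<std::string> from;
  std::vector<std::string> to;
};

// Collects every collection a graph touches. std::set drops duplicates, so a
// vertex collection shared by two relations is listed once.
// Assumption: edge collections are part of the list as well.
std::set<std::string> collectionsOf(std::vector<Relation> const& relations,
                                    std::vector<std::string> const& orphans) {
  std::set<std::string> names;
  for (auto const& r : relations) {
    names.insert(r.edge);
    names.insert(r.from.begin(), r.from.end());
    names.insert(r.to.begin(), r.to.end());
  }
  names.insert(orphans.begin(), orphans.end());
  return names;
}

// Usage: two relations that overlap on "products", plus one orphan.
void exampleCollectionsOf() {
  std::vector<Relation> relations = {
      {"manufactured", {"companies"}, {"products"}},
      {"contains", {"products"}, {"materials"}}};
  std::set<std::string> names = collectionsOf(relations, {"orphan"});
  assert(names.size() == 6);
  assert(names.count("products") == 1);
  assert(names.count("orphan") == 1);
}
```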

mchacki · 2025-07-09T09:41:02Z

tests/js/client/shell/api/rest-schema-handler.js

+            productsCollection.dropIndex(prodIndex);
+            customersCollection.dropIndex(cusIndex);


This is okay, but not required.
Dropping the collection will drop all of its indexes.

mchacki · 2025-07-09T09:43:35Z

tests/js/client/shell/api/rest-schema-handler.js

+            db._drop(COLLECTION_CUSTOMERS);
+            db._drop(COLLECTION_PRODUCTS);
+            db._drop(COLLECTION_COMPANIES);
+            db._drop(EDGE_PURCHASED);
+            db._drop(EDGE_MANUFACTURED);


Good approach, but this does not cover the Indexes or Graphs.

I would suggest the following:
Create a function "tearDown" in Line 59.
Which takes the full implementation of "tearDownAll" below (that one is good and complete!)
Then at this place call tearDown();
And below you set:

tearDownAll: tearDown
This way you need to implement the cleanup only once.
And the setup makes sure nothing is left from any previously failed test.

Ah, great idea! I changed it.

mchacki · 2025-07-09T09:58:55Z

tests/js/client/shell/api/rest-schema-handler.js

+            assertEqual(descView.links.length, 2, 'descView should have 2 links');
+        },
+
+        test_InvalidSampleNumValues_ShouldReturn400: function () {


For the following three tests:

test_InvalidSampleNumValues_ShouldReturn400
test_InvalidExampleNumValues_ShouldReturn400
test_ExampleNumGreaterThanSampleNum_ShouldReturn400

They only test the base "schema" endpoint.
I would like to make sure that collection, graph and view are also following this approach.

Therefore can you add another loop in those tests

const paths = ["", "/collection/products", "/graph/purchaseHistory", "/view/descView"];
paths.forEach(p => {
  const doc = arango.GET_RAW(api + p + param);
  << your code here >>
});

Just make sure collection / graph / view exists and I did not add a typo in the suggestion.

Sorry this also goes for:

test_ExampleNumZero_ShouldReturnNoExamples
test_SampleNumZero_ShouldReturn400

I wasn't aware of that! I added forEach to all the parameter validation tests, not only the ones you suggested.
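The parameter rules that these tests exercise could be captured in one function shared by all four routes. The rules below are inferred from the test names only (zero or invalid sampleNum is rejected, exampleNum of zero is allowed, exampleNum may not exceed sampleNum); `validateParams` and its messages are hypothetical and not the handler's actual code.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Returns the message for a 400 response, or std::nullopt if the pair of
// parameters is acceptable. Rules are inferred from the test names above.
std::optional<std::string> validateParams(std::int64_t sampleNum,
                                          std::int64_t exampleNum) {
  if (sampleNum <= 0) {
    return std::string("sampleNum must be a positive integer");
  }
  if (exampleNum < 0) {
    return std::string("exampleNum must not be negative");
  }
  if (exampleNum > sampleNum) {
    return std::string("exampleNum must not exceed sampleNum");
  }
  return std::nullopt;
}

// Usage: exampleNum of zero is valid (no examples), sampleNum of zero is not.
void exampleValidateParams() {
  assert(!validateParams(100, 0).has_value());
  assert(validateParams(0, 0).has_value());
  assert(validateParams(5, 6).has_value());
}
```

If every route calls the same function before doing any work, looping the tests over all paths checks one behaviour rather than four separate implementations.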

mchacki · 2025-07-09T10:21:31Z

arangod/RestHandler/RestSchemaHandler.cpp

+
+    const auto indType = ind.get("type").stringView();
+    if (indType != "primary" && indType != "edge") {
+      builder.add(velocypack::Value(velocypack::ValueType::Object));


The code is correct.

This is not a strict suggestion, just please check if this makes sense or not.

VelocyPack provides a keep method.
In the tests it is used like this:

TEST(CollectionTest, KeepSomeAttributes) {
  std::string const value(
      "{\"foo\":\"bar\",\"baz\":\"quux\",\"number\":1,\"boolean\":true,"
      "\"empty\":null}");

  Parser parser;
  parser.parse(value);
  Slice s(parser.start());

  std::vector<std::string> const toKeep = {"foo", "baz", "empty"};
  Builder b = Collection::keep(s, toKeep);
  s = b.slice();
  ASSERT_TRUE(s.isObject());
  ASSERT_EQ(3U, s.length());

  ASSERT_TRUE(s.hasKey("foo"));
  ASSERT_EQ("bar", s.get("foo").copyString());

  ASSERT_TRUE(s.hasKey("baz"));
  ASSERT_EQ("quux", s.get("baz").copyString());

  ASSERT_TRUE(s.hasKey("empty"));
  ASSERT_TRUE(s.get("empty").isNull());

  ASSERT_FALSE(s.hasKey("number"));
  ASSERT_FALSE(s.hasKey("boolean"));
  ASSERT_FALSE(s.hasKey("quetzalcoatl"));
}

It should help to keep exactly the attributes you want.
Maybe that is easier for you to use?

Convenient method! I changed the code. I thought I could use it for edgeDefinition but it is not a json object so I couldn't use it.

mchacki · 2025-07-09T10:31:48Z

arangod/RestHandler/RestSchemaHandler.cpp

+  const std::string graphQueryString = R"(
+    FOR g IN _graphs
+    RETURN {
+      name: g._key,
+      relations: g.edgeDefinitions
+    }
+  )";


I would like to get away without implementing this query here.
This functionality should be provided by the GraphManager.
I know it does not right now.
But I think it would be easy to add there.

It has an implementation of:
ResultT<std::unique_ptr<Graph>> lookupGraphByName(
    std::string const& name) const;

Which gives us one Graph Object. And
Result readGraphs(velocypack::Builder& builder) const;

Which gives us all Graphs in JSON format.

Could you check if you can implement a lookupAllGraphs method?

e.g. ResultT<std::vector<std::unique_ptr<Graph>>> lookupAllGraphs() const;
By combining the two?
I am okay if the GraphManager would need to run an AQL inside that function.

But this would encapsulate the logic into that one Class only, and it does not leak the information on where Graphs are stored and in which format to any other files.

Refactored the code:

Implemented lookupAllGraphs() in GraphManager class.

getGraphAndCollections() takes Graph& graph and calls graph.edgeDefinitions() to populate the builder object.

getAllGraphsAndCollections() calls _graphManager.lookupAllGraphs() to obtain vector<unique_ptr<Graph>>, and then calls getGraphAndCollections().

mchacki · 2025-07-09T10:32:55Z

arangod/RestHandler/RestSchemaHandler.cpp

+  const std::string graphQueryString = R"(
+    FOR g IN _graphs
+      FILTER g._key == @graphName
+      RETURN {
+        name: g._key,
+        relations: g.edgeDefinitions
+      }
+  )";


Can you use the graphManager lookupGraphByName here?

mchacki · 2025-07-09T10:33:48Z

arangod/RestHandler/RestSchemaHandler.cpp

+constexpr std::string_view moduleName("graph management");
+}
+
+const std::string RestSchemaHandler::queryStr = R"(


Like this query 👍

mchacki · 2025-07-09T10:34:01Z

arangod/RestHandler/RestSchemaHandler.cpp

+using namespace arangodb::rest;
+
+namespace {
+constexpr std::string_view moduleName("graph management");


Suggested change

constexpr std::string_view moduleName("graph management");

constexpr std::string_view moduleName("schema endpoint");

The module name describes where the call came from.
So this should be the name of the Schema endpoint.

knakatasf added 17 commits June 17, 2025 18:51

feature(api): Added an endpoint _api/schema/<collection-name>

529e205

feature(api): Added an endpoint /_api/schema/<collection-name>

34c78d6

debug(api): Corrected aql and parameter input validation logic

2fdf54e

feature(api): Added /_api/schema endpoint

413607e

feature(api): Added graph insights to /_api/schema endpoint

6a09ec8

chore(test): Added a testing function RestSchemaHandlerTest

fb776d9

chore(RestSchemaHanlderTest): Added a setup and testing fucntion

20f902c

chore(test): Added tests/js/client/shell/api/rest-schema-handler.js

b9fe0f3

feature(api): Refactored schema endpoint

1baaa6f

chore(debug): Corrected the logic of schema endpoint and added testin…

c66f6c4

…g functions

feature(api): Added schema/graph endpoint and error handlings

c749d2b

chore(test): Added schema/graph endpoint tests

84f9458

feature(api): Added /_api/view/<view-name> endpoint

2c2df29

feature(api): Added views attribute to schema endpoint

63b607c

feature(api): Refactored RestSchemaHandler's helper functions' return…

f3e5ab2

… type; added tests for schema/view endpoint

Merge branch 'devel' of github.com:arangodb/arangodb into feature/sch…

7a2644c

…ema-endpoint

chore(test): Added test for rest-schema-handler

f1a2e87

cla-bot bot added the cla-signed label Jul 2, 2025

knakatasf requested a review from mchacki July 2, 2025 17:37

knakatasf self-assigned this Jul 2, 2025

chore(lint): Sanitized the indentations and format of the code

be86a67

Simran-B mentioned this pull request Jul 3, 2025

HTTP API: Schema endpoint arangodb/docs-hugo#736

Open

mchacki reviewed Jul 3, 2025

knakatasf added 6 commits July 4, 2025 03:52

chore(docs): Updated CHANGELOG

d1a8b1e

Merge branch 'devel' of github.com:arangodb/arangodb into feature/sch…

388d7d8

…ema-endpoint

chore: sync and merged devel change (README)

b87ce3a

chore: Refactor RestSchemaHandler helper method return types, add use…

d4ecba7

…r permission check, and includeAllFields option

chore(bug): Changed RestSchemaHandler so that it will work in a cluster

b790727

Merge branch 'devel' of github.com:arangodb/arangodb into feature/sch…

e35bc95

…ema-endpoint

knakatasf added 2 commits July 8, 2025 23:54

feature(api): Refactored RestSchemaHandler for better efficiency and …

96db1b7

…added tests

chore(test): Edited rest-schema-handler.js

5bb7742

mchacki requested changes Jul 9, 2025

knakatasf added 7 commits July 9, 2025 17:54

Merge branch 'devel' of github.com:arangodb/arangodb into feature/sch…

277704a

…ema-endpoint

chore(api): Refactored RestSchemaHandler (graphs, orphans) and change…

2415163

…d js test file

chore(test): Refined and added some tests

e05be98

chore: Refactored RestSchemaHandler

9db5379

Merge branch 'devel' of github.com:arangodb/arangodb into feature/sch…

bb31366

…ema-endpoint

chore: Refactored RestSchemaHander so it inherits from RestVocbaseBas…

df1a4d1

…eHandler

chore: Organized GeneralServerFeature.cpp

0e90e5f
