Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ISSUES-5436 support custom http #7572

Merged
merged 19 commits into from Apr 28, 2020

Conversation

zhang2014
Copy link
Contributor

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

For changelog. Remove if this is non-significant change.

Category (leave one):

  • New Feature

Short description (up to few sentences):
#5436

@zhang2014 zhang2014 force-pushed the feature/ISSUES-5436 branch 11 times, most recently from cfdaffd to 8e1d6ed Compare November 8, 2019 11:39
@zhang2014
Copy link
Contributor Author

zhang2014 commented Nov 11, 2019

@ValBaturin can you help me review this pr?

cc: @alexey-milovidov @millb

@zhang2014 zhang2014 marked this pull request as ready for review November 11, 2019 14:38
@zhang2014 zhang2014 requested a review from a team November 11, 2019 14:38
@ghost ghost requested review from millb and removed request for a team November 11, 2019 14:38
@zhang2014 zhang2014 force-pushed the feature/ISSUES-5436 branch 2 times, most recently from 810e29a to 96d6cdf Compare November 12, 2019 05:22
@ValBaturin
Copy link
Contributor

@zhang2014 sure

Copy link
Contributor

@ValBaturin ValBaturin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like most of the logic is implemented in CustomHTTP interpreter and http handler from server partially moved to the interpreter. I haven't fully read it yet though.
In my opinion, customHTTP name is misleading, and I'd suggest something like predefined queries.
I'll finish the review by tomorrow.

@@ -26,6 +26,12 @@ struct HTMLForm : public Poco::Net::HTMLForm
readUrl(istr);
}

template <typename T>
bool check(const std::string & key, T check_value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it used anywhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -186,6 +189,9 @@ struct ContextShared
std::unique_ptr<Clusters> clusters;
ConfigurationPtr clusters_config; /// Soteres updated configs
mutable std::mutex clusters_mutex; /// Guards clusters and clusters_config
std::unique_ptr<CustomExecutors> custom_executors;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to separate this section from clusters(187 - 191) by a newline or a comment line or just move it somewhere else. Otherwise it looks like it's all about clusters for distributed tables.

@@ -8,6 +8,3 @@ URL="${CLICKHOUSE_PORT_HTTP_PROTO}://${CLICKHOUSE_HOST}:${CLICKHOUSE_PORT_HTTP}/
${CLICKHOUSE_CURL} -vsS ${URL} --data-binary @- <<< "SELECT 1" 2>&1 | perl -lnE 'print if /Keep-Alive/';
${CLICKHOUSE_CURL} -vsS ${URL} --data-binary @- <<< " error here " 2>&1 | perl -lnE 'print if /Keep-Alive/';
${CLICKHOUSE_CURL} -vsS ${URL}ping 2>&1 | perl -lnE 'print if /Keep-Alive/';

# no keep-alive:
${CLICKHOUSE_CURL} -vsS ${URL}404/not/found/ 2>&1 | perl -lnE 'print if /Keep-Alive/';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For 404 test will be move to integration test. for no keep-alive we are currently not compatible.

@@ -6,6 +6,7 @@
#include <DataTypes/IDataType.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this file changed unintentionally?

Copy link
Contributor

@ValBaturin ValBaturin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Executor family doesn't fit into the current interpreter design. HTTPInputStreams and HTTPOutputStreams should be defined in DataStreams or at least in server. CustomHTTP name as it was already mentioned above is misleading e.g. it might mean an interpreter of modification of http protocol but an interpreter of custom HTTP handlers with prepared queries. I also don't really get why CustomExecutor is named that way.
To be honest I'm not sure if it's a job of an interpreter at all. Yeah, I see that you do some AST modifications and it's indeed an interpretation stage but overall dealing with HTTPServerRequest and HTTPServerResponse feels more like a server job.
I'd be happy to hear your elaboration on such a design decision and any thoughts of other people who've been involved in the project for a while or have a better grasp on the subject.

class CustomExecutor
{
public:
bool canBeParseRequestBody() const;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It must be a typo, right?

@@ -35,6 +35,10 @@
#include <DataStreams/IBlockInputStream.h>
#include <Interpreters/executeQuery.h>
#include <Interpreters/Quota.h>
#include <Interpreters/CustomHTTP/CustomExecutor.h>
#include <Interpreters/CustomHTTP/HTTPInputStreams.h>
#include <Interpreters/CustomHTTP/ExtractorClientInfo.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extractor Client Info is only used in here. I'd suggest to make it local since it shouldn't be in interpreter anyway.

@zhang2014
Copy link
Contributor Author

zhang2014 commented Nov 15, 2019

@ValBaturin Thanks your review

Executor family doesn't fit into the current interpreter design. HTTPInputStreams and HTTPOutputStreams should be defined in DataStreams or at least in server.

I move them into DataStreams

CustomHTTP name as it was already mentioned above is misleading e.g. it might mean an interpreter of modification of http protocol but an interpreter of custom HTTP handlers with prepared queries. I also don't really get why CustomExecutor is named that way.

I'll rename them. what do you recommend?

As mentioned above, I rename them to predefined queries.

To be honest I'm not sure if it's a job of an interpreter at all. Yeah, I see that you do some AST modifications and it's indeed an interpretation stage but overall dealing with HTTPServerRequest and HTTPServerResponse feels more like a server job.
I'd be happy to hear your elaboration on such a design decision and any thoughts of other people who've been involved in the project for a while or have a better grasp on the subject.

Because we put the CustomExecutor into context, if we put the CustomExecutor into the programs the will appear interpreter dependent programs

Now they are refactored into the programs

@zhang2014 zhang2014 force-pushed the feature/ISSUES-5436 branch 4 times, most recently from 4931649 to 84e6c08 Compare November 22, 2019 10:43
@zhang2014
Copy link
Contributor Author

@ValBaturin Can you help me review this PR again?

@zhang2014 zhang2014 force-pushed the feature/ISSUES-5436 branch 2 times, most recently from 2d26e4e to 749401b Compare November 22, 2019 12:23
@zhang2014 zhang2014 changed the title [WIP] ISSUES-5436 support custom http ISSUES-5436 support custom http Apr 27, 2020
@zhang2014
Copy link
Contributor Author

@alexey-milovidov @filimonov

Sorry for delay. I think this pr is ready for review

@qoega
Copy link
Member

qoega commented Apr 27, 2020

Can you remove also here

PingRequestHandler.cpp

@alexey-milovidov
Copy link
Member

Current reviewer is @tavplubix

@tavplubix tavplubix mentioned this pull request Apr 27, 2020
@tavplubix tavplubix merged commit d56002b into ClickHouse:master Apr 28, 2020
static inline Poco::Net::HTTPRequestHandlerFactory * createHTTPHandlerFactory(IServer & server, const std::string & name, AsynchronousMetrics & async_metrics)
{
if (server.config().has("routing_rules"))
return createHandlersFactoryFromConfig(server, name, "routing_rules");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the intention that http_handlers (it was renamed in merging PR - #10547) overrides default paths? i.e. /ping, /replicas_status
Maybe it will better to use separate server for custom http handlers (on another port)? Or register custom http handlers first and then defaults, to disallow their overrides (or allow overrides, but register anyway)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@filimonov
Copy link
Contributor

filimonov commented Jun 11, 2020

@azat - but standard paths are still there:
https://github.com/ClickHouse/ClickHouse/pull/10547/files#diff-5981515e3ab857757e2ccc763c560d2cR121-R128

auto factory = (new HTTPRequestHandlerFactoryMain(name))
->addHandler((new HandlingRuleHTTPHandlerFactory<StaticRequestHandler>(server, root_response_expression))
->attachStrictPath("/")->allowGetAndHeadRequest())
->addHandler((new HandlingRuleHTTPHandlerFactory<StaticRequestHandler>(server, ping_response_expression))
->attachStrictPath("/ping")->allowGetAndHeadRequest())
->addHandler((new HandlingRuleHTTPHandlerFactory<ReplicasStatusHandler>(server))
->attachNonStrictPath("/replicas_status")->allowGetAndHeadRequest())
->addHandler((new HandlingRuleHTTPHandlerFactory<DynamicQueryHandler>(server, "query"))->allowPostAndGetParamsRequest());
if (server.config().has("prometheus") && server.config().getInt("prometheus.port", 0) == 0)
factory->addHandler((new HandlingRuleHTTPHandlerFactory<PrometheusRequestHandler>(
server, PrometheusMetricsWriter(server.config(), "prometheus", async_metrics)))
->attachStrictPath(server.config().getString("prometheus.endpoint", "/metrics"))->allowGetAndHeadRequest());
return factory;

or you mean that it can be overwritten by a custom config?

@azat
Copy link
Collaborator

azat commented Jun 11, 2020

@azat - but standard paths are still there:

I see, but it is under else, so with enabled http_handlers they do not work

@filimonov
Copy link
Contributor

@azat - but standard paths are still there:

I see, but it is under else, so with enabled http_handlers they do not work

Indeed. That's crap.

We need to push standard handers (ping / replica_status / prometheus) before user-defined handlers always.

@alexey-milovidov
Copy link
Member

If I remember correctly, custom HTTP handlers can be also used to disable or replace default HTTP handlers. It makes sense because it allows to provide limited access to the server. And to keep default HTTP handlers along with custom, you have to add them to configuration.

@tavplubix am I right?

@tavplubix
Copy link
Member

Yes, user may want to configure default handlers other way and it can be done with custom handlers, so current behaviour seems ok to me.

@filimonov
Copy link
Contributor

filimonov commented Jun 11, 2020

I though default configuration lists a way how to configure /ping & /replica_status to return them back, but it's not. For /replicas_status it seems not possible, same for /prometeus

@tavplubix
Copy link
Member

I will add more configuration examples and ability to configure ReplicasStatusHandler and PrometheusRequestHandler.

@filimonov
Copy link
Contributor

filimonov commented Jun 11, 2020

OK.

Actually there is one more option - add standard handers to the handers from config always at the end.

So user will be able to overwrite them by his own handlers which will have higher priority.
For example, if the user will decide to remove /ping handler he will just configure.

        <rule>
            <url>/ping</url>
            <handler>
                <type>static</type>
                 <status>404</status>
            </handler>
        </rule>

and it will have higher priority.

@filimonov
Copy link
Contributor

And i think the more common use case is to extend standatd handlers, not replace all of them.

@alexey-milovidov
Copy link
Member

@filimonov There is a use case - to build limited API without access to default handlers.

We can:

  • provide example directly in config.xml on how to configure default handlers;
  • add support for <default/> specification on the same level as <rule> to automatically enable all default handlers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-feature Pull request with new product feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet