Skip to content

Use git submodule to download pulsar-client-cpp dependency #3

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

BewareMyPower
Copy link
Contributor

Modifications

  • Add pulsar-client-cpp as the submodule
  • Add build-client-cpp.sh to build the Pulsar C++ client.
  • Fix the CMakeLists.txt to make Python client build successfully. Instead of linking the pulsarStatic target, link to the static library built by build-client-cpp.sh.
  • Add README to describe how to build Python client.

Currently the Python client uses some C++ headers not exposed (i.e. in the pulsar-client-cpp/lib directory). This PR added the pulsar-client-cpp path prefix. We should remove these includes in future.

### Modifications
- Add [pulsar-client-cpp](https://github.com/apache/pulsar-client-cpp)
  as the submodule
- Add `build-client-cpp.sh` to build the Pulsar C++ client.
- Fix the CMakeLists.txt to make Python client build successfully.
  Instead of linking the `pulsarStatic` target, link to the static
  library built by `build-client-cpp.sh`.
- Add README to describe how to build Python client.

Currently the Python client uses some C++ headers not exposed (i.e. in
the `pulsar-client-cpp/lib` directory). This PR added the
`pulsar-client-cpp` path prefix. We should remove these includes in
future.
@BewareMyPower
Copy link
Contributor Author

@RobertIndie @merlimat @eolivelli Could you take a second look? I believe this PR is more universal than #1 that focuses on the script to build Python client on macOS. This PR also avoids many code changes (include CMakeLists.txt)

Copy link
Contributor

@codelipenghui codelipenghui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tried on my laptop.
It works great.

image

Copy link
Contributor

@merlimat merlimat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the submodule approach too much.

  1. Git submodule brings its own set of problems
  2. We must not use a random version from C++ client main branch. It might not be a stable version and it would be difficult to track down the particular c++ code while debugging. (eg: the client version string will be misleading/useless)
  3. When we're building a source tar.gz we would have to include C++ code, but that it's really not ideal
  4. The Python wrapper shouldn't depend on any internal code from the C++ library, just the public API
  5. When building Python client, we shouldn't be building the C++ client lib. Instead we should just fetch the binaries that are already available (eg: RPM, Deb, APK, etc.. ). (Homebrew is a bit a of an exception since it's not statically linked).

One of the main reason to decouple C++ and Python from Java is to make their release process much more easier. We now have the freedom of doing releases with much less overhead and we can release new features much quicker than before.


-->

# Pulsar Python client library
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The README should be focused on users, rather than developers.

We should list all the platforms/OS/Python versions that are supported, how to install the wheels and how to get started.

We can a have a link to a developer page where there are instructions to compile it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documents here are only related to changes of this PR. We should use another PR to add more content into README.

@BewareMyPower
Copy link
Contributor Author

Hi @merlimat, here are my points for some of your concerns.

For the most important issue (IMO) you've mentioned

We must not use a random version from C++ client main branch.

The main branch is not guaranteed to be stable. We should only depend the stable version of C++ client for formal releases. For example, git submodule can specify a tag of a formal C++ client release. When we release a Python client, we can specify the tag like

[submodule "pulsar-client-cpp"]
	path = pulsar-client-cpp
	url = https://github.com/apache/pulsar-client-cpp.git
        tag = v3.0.0-pre-release-1

The main branch should also relies on a tag. However, there is no tag in the new created pulsar-client-cpp repo. Should we create one and rely on it? Or I think we can reset to a specific commit in the script here.


When we're building a source tar.gz we would have to include C++ code
The Python wrapper shouldn't depend on any internal code from the C++ library, just the public API

I agree. Just want to remove the dependency in another PR. I will do that in this PR soon.

When building Python client, we shouldn't be building the C++ client lib. Instead we should just fetch the binaries that are already available (eg: RPM, Deb, APK, etc.. )

However, it makes the release or Python client slow.

@BewareMyPower BewareMyPower marked this pull request as draft September 30, 2022 03:53
@BewareMyPower
Copy link
Contributor Author

BewareMyPower commented Sep 30, 2022

Since most of the codes will be the same with #1, I marked this PR as draft and will make further changes after #1 is merged.

In future, this PR title might become "Add cpp client as a submodule" or "Add Instructions of building Python client from source"

@BewareMyPower
Copy link
Contributor Author

We now have the freedom of doing releases with much less overhead and we can release new features much quicker than before.

I'm wondering how to manage the releases for C++ client now. If the main branch must depend on a released C++ client, then, if we want to support a new feature, we have to release the C++ client first. Then each C++ feature requires a release. I'm afraid that it's not a good way for release even if the release could be much quicker than before. We might have:

  • pulsar-cpp-3.0.0-pre-feature1
  • pulsar-cpp-3.0.0-pre-feature2
  • pulsar-cpp-3.0.0-pre-feature3
  • ...

Adding a git tag for them might be good (but I think it could still create a lot of tags). But should we create each binary uploaded for a single feature? @merlimat

@BewareMyPower BewareMyPower changed the title PIP-209: Support build Python client wrapper Use git submodule to download pulsar-client-cpp dependency Sep 30, 2022
@merlimat
Copy link
Contributor

When we release a Python client, we can specify the tag like

Exactly, but then it would be the same as depending on a released version of C++ client. Having the submodule won't help here.

if we want to support a new feature, we have to release the C++ client first.

Yes, if the feature is significant enough to grant a C++ release, then we just do the 2 releases (and also NodeJS and maybe other wrappers).

If the feature is not big enough, we can wait to include other changes in C++ before releasing. We would still be in a completely different situation from where we are today, where the Python release is tied to a huge Pulsar release that includes everything.

If we automate all the steps, we are now in a position to do C++ release with minimal effort, therefore we can do them very frequently.

@BewareMyPower
Copy link
Contributor Author

Having the submodule won't help here.

I have thought that we can depend on an unstable version of pulsar-client-cpp in main branch, then for those want some specific features (or bug fixes), they can build the Python client from source.

But just thinking again, they can also build the unreleased C++ client from source. Then build the unreleased Python client from source based on the C++ client built by themselves. But this way should not be encouraged anyway unless the demands are very urgent.

It looks like Pulsar and BK, but C++ client can be released very quickly, not like BK. So I will close this PR.

FYI, @codelipenghui @Demogorgon314 @shibd

@tisonkun
Copy link
Member

tisonkun commented Oct 1, 2022

thinking again

Agree. Submodule increase understanding burdens.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants