Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Receive link not reopened after being remote closed #219

Closed
markvandenbergh opened this issue May 1, 2018 · 25 comments
Closed

Receive link not reopened after being remote closed #219

markvandenbergh opened this issue May 1, 2018 · 25 comments
Assignees
Milestone

Comments

@markvandenbergh
Copy link

We have the following stacktrace which suggests the receive link will be reopened, however this does not seem to happen as the messages are no longer retrieved from the subscription until a restart.

2018-04-19 15:54:35,313 [22-b0f1-9201120a7d67] ERROR c.m.a.s.p.CoreMessageReceiver - Receive link to 'ResponseMessages/subscriptions/mysubscription', sessionId 'null' closed with error.
com.microsoft.azure.servicebus.primitives.ServiceBusException: Error{condition=null, description='null', info=null}
        at com.microsoft.azure.servicebus.primitives.ExceptionUtil.toException(ExceptionUtil.java:113) ~[azure-servicebus-1.1.1.jar:na]
        at com.microsoft.azure.servicebus.primitives.CoreMessageReceiver.onError(CoreMessageReceiver.java:764) [azure-servicebus-1.1.1.jar:na]
        at com.microsoft.azure.servicebus.primitives.CoreMessageReceiver.onClose(CoreMessageReceiver.java:939) [azure-servicebus-1.1.1.jar:na]
        at com.microsoft.azure.servicebus.amqp.BaseLinkHandler.processOnClose(BaseLinkHandler.java:68) [azure-servicebus-1.1.1.jar:na]
        at com.microsoft.azure.servicebus.amqp.BaseLinkHandler.onLinkRemoteClose(BaseLinkHandler.java:42) [azure-servicebus-1.1.1.jar:na]
        at org.apache.qpid.proton.engine.BaseHandler.handle(BaseHandler.java:176) [proton-j-0.22.0.jar:na]
        at org.apache.qpid.proton.engine.impl.EventImpl.dispatch(EventImpl.java:108) [proton-j-0.22.0.jar:na]
        at org.apache.qpid.proton.reactor.impl.ReactorImpl.dispatch(ReactorImpl.java:309) [proton-j-0.22.0.jar:na]
        at org.apache.qpid.proton.reactor.impl.ReactorImpl.process(ReactorImpl.java:276) [proton-j-0.22.0.jar:na]
        at com.microsoft.azure.servicebus.primitives.MessagingFactory$RunReactor.run(MessagingFactory.java:451) [azure-servicebus-1.1.1.jar:na]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_162]
2018-04-19 15:54:35,313 [22-b0f1-9201120a7d67] ERROR c.m.a.s.p.CoreMessageReceiver - Receive link to 'ResponseMessages/subscriptions/mysubscription', sessionId 'null' will be reopened after 'PT1S'

The logging level of this class was set at error when this happened (two or three times now), we are going to change the logging level to get more detail on this when it happens again. I will update this ticket once we get more detail.

Actual Behavior

  1. Receive link is remotely closed
  2. Receive link is not reopened

Expected Behavior

  1. Receive link should be reopened after a disconnect

Versions

  • OS platform and version: Ubuntu 16.04.4 LTS
  • Maven package version or commit ID: 1.1.1
@markvandenbergh
Copy link
Author

It happened again, we have extra logging now:

2018-05-04 09:31:49,862 [af-80ed-69e51d4ef474] DEBUG c.m.a.s.p.CoreMessageReceiver - Cancelled SAS Token renew timer
2018-05-04 09:31:49,862 [af-80ed-69e51d4ef474] ERROR c.m.a.s.p.CoreMessageReceiver - Receive link to 'ResponseMessages/subscriptions/mysubscription3', sessionId 'null' closed with error.
com.microsoft.azure.servicebus.primitives.ServiceBusException: Error{condition=null, description='null', info=null}
        at com.microsoft.azure.servicebus.primitives.ExceptionUtil.toException(ExceptionUtil.java:113) ~[azure-servicebus-1.1.1.jar:na]
        at com.microsoft.azure.servicebus.primitives.CoreMessageReceiver.onError(CoreMessageReceiver.java:764) [azure-servicebus-1.1.1.jar:na]
        at com.microsoft.azure.servicebus.primitives.CoreMessageReceiver.onClose(CoreMessageReceiver.java:939) [azure-servicebus-1.1.1.jar:na]
        at com.microsoft.azure.servicebus.amqp.BaseLinkHandler.processOnClose(BaseLinkHandler.java:68) [azure-servicebus-1.1.1.jar:na]
        at com.microsoft.azure.servicebus.amqp.BaseLinkHandler.onLinkRemoteClose(BaseLinkHandler.java:42) [azure-servicebus-1.1.1.jar:na]
        at org.apache.qpid.proton.engine.BaseHandler.handle(BaseHandler.java:176) [proton-j-0.22.0.jar:na]
        at org.apache.qpid.proton.engine.impl.EventImpl.dispatch(EventImpl.java:108) [proton-j-0.22.0.jar:na]
        at org.apache.qpid.proton.reactor.impl.ReactorImpl.dispatch(ReactorImpl.java:309) [proton-j-0.22.0.jar:na]
        at org.apache.qpid.proton.reactor.impl.ReactorImpl.process(ReactorImpl.java:276) [proton-j-0.22.0.jar:na]
        at com.microsoft.azure.servicebus.primitives.MessagingFactory$RunReactor.run(MessagingFactory.java:451) [azure-servicebus-1.1.1.jar:na]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_171]
2018-05-04 09:31:49,862 [af-80ed-69e51d4ef474] ERROR c.m.a.s.p.CoreMessageReceiver - Receive link to 'ResponseMessages/subscriptions/mysubscription3', sessionId 'null' will be reopened after 'PT1S'
2018-05-04 09:31:50,863 [pool-9-thread-2] INFO  c.m.a.s.p.CoreMessageReceiver - Recreating receive link to 'ResponseMessages/subscriptions/mysubscription3'
2018-05-04 09:31:50,868 [ommandBusExecutor-29] INFO  o.a.c.callbacks.LoggingCallback - Command executed successfully: io.motown.domain.api.chargingstation.HeartbeatCommand
2018-05-04 09:31:50,868 [Thread-12073969] DEBUG c.m.a.s.p.CoreMessageReceiver - Sent SAS Token and set renew timer
2018-05-04 09:31:50,869 [af-80ed-69e51d4ef474] INFO  c.m.a.s.p.CoreMessageReceiver - Creating receive link to 'ResponseMessages/subscriptions/mysubscription3'
2018-05-04 09:31:50,869 [af-80ed-69e51d4ef474] DEBUG c.m.a.s.p.CoreMessageReceiver - Receive link settle mode 'sender settle mode: UNSETTLED, receiver settle mode: SECOND'
2018-05-04 09:31:51,435 [pool-9-thread-4] WARN  c.m.a.s.p.CoreMessageReceiver - No messages received from 'ResponseMessages/subscriptions/mysubscription3'. Pending receive request timed out. Returning null to the client.
2018-05-04 09:31:51,435 [Thread-12073977] DEBUG c.m.a.s.p.CoreMessageReceiver - Receiving maximum of '1' messages from 'ResponseMessages/subscriptions/mysubscription3'
2018-05-04 09:31:51,435 [af-80ed-69e51d4ef474] DEBUG c.m.a.s.p.CoreMessageReceiver - Sent flow to the service. receiverPath:ResponseMessages/subscriptions/mysubscription3, linkname:3e25e3_15c2d55ef267404d8ca222c6ffd1d8bc_G30, updated-link-credit:200, sentCredits:200
2018-05-04 09:31:56,474 [pool-9-thread-4] WARN  c.m.a.s.p.CoreMessageReceiver - No messages received from 'ResponseMessages/subscriptions/mysubscription3'. Pending receive request timed out. Returning null to the client.
2018-05-04 09:31:56,474 [Thread-12074027] DEBUG c.m.a.s.p.CoreMessageReceiver - Receiving maximum of '1' messages from 'ResponseMessages/subscriptions/mysubscription3'

Not sure if this is related:

2018-05-04 09:31:57,936 [af-80ed-69e51d4ef474] WARN  c.m.a.s.amqp.ConnectionHandler - Connection.onTransportError: hostname:ourhostname.servicebus.windows.net:5671, error:connection aborted
2018-05-04 09:31:57,936 [af-80ed-69e51d4ef474] ERROR c.m.a.s.primitives.MessagingFactory - Connection error. 'Error{condition=amqp:connection:framing-error, description='connection aborted', info=null}'
2018-05-04 09:31:57,936 [af-80ed-69e51d4ef474] INFO  c.m.a.s.primitives.MessagingFactory - Closing connection to host
2018-05-04 09:31:57,937 [af-80ed-69e51d4ef474] ERROR c.m.a.s.p.RequestResponseLink - Internal send link of requestresponselink to '$cbs' closed with error.
com.microsoft.azure.servicebus.primitives.ServiceBusException: Error{condition=amqp:connection:framing-error, description='connection aborted', info=null}
        at com.microsoft.azure.servicebus.primitives.ExceptionUtil.toException(ExceptionUtil.java:113) ~[azure-servicebus-1.1.1.jar:na]
        at com.microsoft.azure.servicebus.primitives.RequestResponseLink$InternalSender.onClose(RequestResponseLink.java:743) ~[azure-servicebus-1.1.1.jar:na]
        at com.microsoft.azure.servicebus.amqp.BaseLinkHandler.processOnClose(BaseLinkHandler.java:68) [azure-servicebus-1.1.1.jar:na]
        at com.microsoft.azure.servicebus.primitives.MessagingFactory.closeConnection(MessagingFactory.java:343) [azure-servicebus-1.1.1.jar:na]
        at com.microsoft.azure.servicebus.primitives.MessagingFactory.onConnectionError(MessagingFactory.java:267) [azure-servicebus-1.1.1.jar:na]
        at com.microsoft.azure.servicebus.amqp.ConnectionHandler.onTransportError(ConnectionHandler.java:90) [azure-servicebus-1.1.1.jar:na]
        at org.apache.qpid.proton.engine.BaseHandler.handle(BaseHandler.java:191) [proton-j-0.22.0.jar:na]
        at org.apache.qpid.proton.engine.impl.EventImpl.dispatch(EventImpl.java:108) [proton-j-0.22.0.jar:na]
        at org.apache.qpid.proton.reactor.impl.ReactorImpl.dispatch(ReactorImpl.java:309) [proton-j-0.22.0.jar:na]
        at org.apache.qpid.proton.reactor.impl.ReactorImpl.process(ReactorImpl.java:276) [proton-j-0.22.0.jar:na]
        at com.microsoft.azure.servicebus.primitives.MessagingFactory$RunReactor.run(MessagingFactory.java:451) [azure-servicebus-1.1.1.jar:na]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_171]
2018-05-04 09:32:01,943 [pool-9-thread-1] WARN  c.m.a.s.p.CoreMessageReceiver - No messages received from 'ResponseMessages/subscriptions/mysubscription3'. Pending receive request timed out. Returning null to the client.
2018-05-04 09:32:01,944 [Thread-12074121] DEBUG c.m.a.s.p.CoreMessageReceiver - Receiving maximum of '1' messages from 'ResponseMessages/subscriptions/mysubscription3'
2018-05-04 09:32:02,458 [pool-9-thread-1] WARN  c.m.a.s.p.CoreMessageReceiver - No messages received from 'ResponseMessages/subscriptions/mysubscription3'. Pending receive request timed out. Returning null to the client.
2018-05-04 09:32:02,458 [Thread-12074138] DEBUG c.m.a.s.p.CoreMessageReceiver - Receiving maximum of '1' messages from 'ResponseMessages/subscriptions/mysubscription3'

Looking at the "No message received etc" logging after the error the code seems to think the connection is (still) ok. However the subscription is getting filled with messages that are no longer retrieved.

We have a multi-instance solution (all with their own subscription on the same service bus) and I see this message at the same time at all of them:

2018-05-04 09:31:49,862 [af-80ed-69e51d4ef474] ERROR c.m.a.s.p.CoreMessageReceiver - Receive link to 'ResponseMessages/subscriptions/mysubscription<number>', sessionId 'null' closed with error.

The other instances all continued processing messages after that error, just one did not. It's not always the same instance that fails.

screenshot 2018-05-04 10 06 21

@hope4555
Copy link

Same happens to us
After sometime the connection is lost and messages start to pile up on the subscription
only if we disconnect and reconnect periodically we can overcome this

@las3r
Copy link

las3r commented Jun 22, 2018

I have the exact same problem as noted in #224. I think even more of a coincidence is that @markvandenbergh and I happen to be coworkers (we didn't know we had the same issues).

For completeness sake:

@yvgopal We have the exact same issue. Connection closed due to inactivity, but then on reconnection we are getting the same stacktrace. This results in a situation where it stops processing messages and we need to restart our application in order to get it to process messages again.

Waiting for it to be automatically reconnected doesn't solve the problem.

Is there a workaround for this, or a way for us to handle these exceptions and manually reconnect?

2018-06-22 09:03:19.024 ERROR 51951 --- [62-6fc9901c7a4d] c.m.a.s.primitives.MessagingFactory      : Connection error. 'Error{condition=amqp:connection:forced, description='The connection was inactive for more than the allowed 60000 milliseconds and is closed by container 'LinkTracker'. TrackingId:bd8776fd498040c2b84f12df0705928c_G20, SystemTracker:gateway7, Timestamp:6/22/2018 9:04:08 AM', info=null}'
2018-06-22 09:03:19.025  INFO 51951 --- [62-6fc9901c7a4d] c.m.a.s.primitives.MessagingFactory      : Closing connection to host
2018-06-22 09:03:19.025 ERROR 51951 --- [62-6fc9901c7a4d] c.m.a.s.primitives.CoreMessageReceiver   : Receive link to 'OUR SUB' closed with error.
 
com.microsoft.azure.servicebus.primitives.ServiceBusException: Error{condition=amqp:connection:forced, description='The connection was inactive for more than the allowed 60000 milliseconds and is closed by container 'LinkTracker'. TrackingId:bd8776fd498040c2b84f12df0705928c_G20, SystemTracker:gateway7, Timestamp:6/22/2018 9:04:08 AM', info=null}
        at com.microsoft.azure.servicebus.primitives.ExceptionUtil.toException(ExceptionUtil.java:113) ~[azure-servicebus-1.0.0.jar!/:na]
        at com.microsoft.azure.servicebus.primitives.CoreMessageReceiver.onError(CoreMessageReceiver.java:767) [azure-servicebus-1.0.0.jar!/:na]
        at com.microsoft.azure.servicebus.primitives.CoreMessageReceiver.onClose(CoreMessageReceiver.java:924) [azure-servicebus-1.0.0.jar!/:na]
        at com.microsoft.azure.servicebus.amqp.BaseLinkHandler.processOnClose(BaseLinkHandler.java:68) [azure-servicebus-1.0.0.jar!/:na]
        at com.microsoft.azure.servicebus.primitives.MessagingFactory.closeConnection(MessagingFactory.java:341) [azure-servicebus-1.0.0.jar!/:na]
        at com.microsoft.azure.servicebus.primitives.MessagingFactory.onConnectionError(MessagingFactory.java:266) [azure-servicebus-1.0.0.jar!/:na]
        at com.microsoft.azure.servicebus.amqp.ConnectionHandler.onConnectionRemoteClose(ConnectionHandler.java:117) [azure-servicebus-1.0.0.jar!/:na]
        at org.apache.qpid.proton.engine.BaseHandler.handle(BaseHandler.java:128) [proton-j-0.13.1.jar!/:na]
        at org.apache.qpid.proton.engine.impl.EventImpl.dispatch(EventImpl.java:108) [proton-j-0.13.1.jar!/:na]
        at org.apache.qpid.proton.reactor.impl.ReactorImpl.dispatch(ReactorImpl.java:307) [proton-j-0.13.1.jar!/:na]
        at org.apache.qpid.proton.reactor.impl.ReactorImpl.process(ReactorImpl.java:275) [proton-j-0.13.1.jar!/:na]
        at com.microsoft.azure.servicebus.primitives.MessagingFactory$RunReactor.run(MessagingFactory.java:448) [azure-servicebus-1.0.0.jar!/:na]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_151]
 
2018-06-22 09:03:19.025 ERROR 51951 --- [62-6fc9901c7a4d] c.m.a.s.primitives.CoreMessageReceiver   : Receive link to 'OURSUB' will be reopened after 'PT1.477121254S'
2018-06-22 09:10:22.545 ERROR 51951 --- [92-7be7d924b752] c.m.a.s.primitives.CoreMessageSender     : Opending send link to 'OurTopic' failed
 
com.microsoft.azure.servicebus.primitives.ServiceBusException: com.microsoft.azure.servicebus.amqp.AmqpException: The link 'G10:2944348:06fc97_566d964c4bbb439cb0a773a30497e04b_G10' is force detached by the broker due to errors occurred in publisher(link1784516). Detach origin: AmqpMessagePublisher.IdleTimerExpired: Idle timeout: 00:10:00. TrackingId:6ea7496d0000001b001b3ac45b2cbad7_G10_B13, SystemTracker:tallweuacafmsbn01:Topic:OURTOPIC, Timestamp:6/22/2018 9:11:11 AM
        at com.microsoft.azure.servicebus.primitives.ExceptionUtil.toException(ExceptionUtil.java:86) ~[azure-servicebus-1.0.0.jar!/:na]
        at com.microsoft.azure.servicebus.primitives.CoreMessageSender.onClose(CoreMessageSender.java:413) ~[azure-servicebus-1.0.0.jar!/:na]
        at com.microsoft.azure.servicebus.amqp.BaseLinkHandler.processOnClose(BaseLinkHandler.java:68) [azure-servicebus-1.0.0.jar!/:na]
        at com.microsoft.azure.servicebus.amqp.BaseLinkHandler.onLinkRemoteClose(BaseLinkHandler.java:42) [azure-servicebus-1.0.0.jar!/:na]
        at org.apache.qpid.proton.engine.BaseHandler.handle(BaseHandler.java:176) [proton-j-0.13.1.jar!/:na]
        at org.apache.qpid.proton.engine.impl.EventImpl.dispatch(EventImpl.java:108) [proton-j-0.13.1.jar!/:na]
        at org.apache.qpid.proton.reactor.impl.ReactorImpl.dispatch(ReactorImpl.java:307) [proton-j-0.13.1.jar!/:na]
        at org.apache.qpid.proton.reactor.impl.ReactorImpl.process(ReactorImpl.java:275) [proton-j-0.13.1.jar!/:na]
        at com.microsoft.azure.servicebus.primitives.MessagingFactory$RunReactor.run(MessagingFactory.java:448) [azure-servicebus-1.0.0.jar!/:na]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_151]
Caused by: com.microsoft.azure.servicebus.amqp.AmqpException: The link 'G10:2944348:06fc97_566d964c4bbb439cb0a773a30497e04b_G10' is force detached by the broker due to errors occurred in publisher(link1784516). Detach origin: AmqpMessagePublisher.IdleTimerExpired: Idle timeout: 00:10:00. TrackingId:6ea7496d0000001b001b3ac45b2cbad7_G10_B13, SystemTracker:tallweuacafmsbn01:Topic:OURTOPIC, Timestamp:6/22/2018 9:11:11 AM
        ... 10 common frames omitted

Our MessageHandler looks like this:

public class OurSubscriber<E> implements IMessageHandler {

	private static final Logger LOGGER = LoggerFactory.getLogger(OurSubscriber.class);

	private final Class<E> messageClass;

	private final Consumer<E> processor;

	public OurSubscriber(Class<E> messageClass, Consumer<E> processor) {
		this.messageClass = messageClass;
		this.processor = processor;
	}
	
	@Override
	public CompletableFuture<Void> onMessageAsync(IMessage message) {
		final String messageString = new String(message.getBody(), StandardCharsets.UTF_8);
		
		LOGGER.info("Received event: {}", messageString);

		try {
			E model = new ObjectMapper().readValue(messageString, messageClass);
			processor.accept(model);
		} catch (IOException e) {
			LOGGER.error("Unable to convert JSON to [{}]", messageClass.getSimpleName(), e);
		}

		return CompletableFuture.completedFuture(null);
	}

	@Override
	public void notifyException(Throwable exception, ExceptionPhase phase) {
		LOGGER.warn("Unexpected exception occurred", exception);
	}
}

@tobiasgwaaler
Copy link

Just want to chime in and say how bothersome this issue is. We're currently reconnecting every 10 minutes, by creating a new client and closing the old one, as a workaround, but it's pretty messy and should not be necessary when using a service we're actually paying for.

@las3r
Copy link

las3r commented Jun 22, 2018

Our .NET applications (both function apps and other applications) have none of these problems, so it must be the java library that's causing issues.

@nemakam
Copy link
Contributor

nemakam commented Jun 22, 2018

We will take a look and get back to you

@yvgopal yvgopal self-assigned this Jun 22, 2018
@yvgopal
Copy link
Member

yvgopal commented Jun 25, 2018

@markvandenbergh @hope4555 @las3r @tobiasgwaaler
I am trying to reproduce this issue myself. I need some more information from you guys.

  1. The trace and link getting closed on idle tells me that you should be using a MessageReceiver. Am I right? MessageReceiver is different from QueueClient or SubscriptionClient in the sense that MessageReceiver only has Receive() methods where Clients only have OnMessage pump. OnMessage pump never leaves the receive link idle. So this link getting closed on idle shouldn't happen if you are using SubscriptionClient. @las3r pasted code for his onMessage handler. Are you running into this issue even when you are using only SubscritionClient? An answer to this question will really help me understand if that is one issue or multiple issues manifesting as one.
  2. How often do you see this happen? And is it consistent or random? How long will your receivers remain idle before you run into this problem?
  3. Is your entity (queue or topic) partitioned or non-partitioned?
  4. What is the client version you are using?
  5. Do you run into any network issues which cause random disconnects?

@las3r
Copy link

las3r commented Jun 26, 2018

I'll try to answer to the best of my ability @yvgopal, perhaps @markvandenbergh can answer this as well :).

  1. This one I'll ask to my team, I'll get back to you on this one
  2. Very very random. We've got over 50 java consumers in our application landscape, and we've seen this happen a few times (5-10 times). Sometimes it's after a week, sometimes after a day. What we've seen is that once this happens, it starts to happen more to the same subscriber (we've restarted on of them for the third time in 3 days).
  3. Non-partitioned
  4. azure-spring-boot/0.2.3 >> SDK version 1.0.0
  5. None at all.

@markvandenbergh
Copy link
Author

  1. We are using SubscriptionClient
  2. Very random, sometimes twice a week, sometimes once in three weeks. We have less consumers than @las3r (I think around 15)
  3. Non-partitioned
  4. 1.1.1
  5. Not that we know off

@tobiasgwaaler
Copy link

  1. We're using SubscriptionClient
  2. Seems random. Can be anywhere between a couple of days, to weeks, between the incidents.
  3. Non-partitioned
  4. 1.1.1
  5. A couple of the times I suspect there was network issues, where the application was not able to reach the service-bus server for a while. But I think we've seen it even when there were no network issues.

Reproducing this is difficult, but I'm wondering if this issue is related to another issue: if the application is unable to reach the service-bus server for some time (say 5 minutes), it will not be able to reconnect once the network is working again. You can reproduce this by running the code below, and simply disconnecting your network entirely for a while, then reconnect. The subscription will not reconnect after this.

import com.microsoft.azure.servicebus.*;
import com.microsoft.azure.servicebus.primitives.ConnectionStringBuilder;
import com.microsoft.azure.servicebus.primitives.ServiceBusException;

import java.util.concurrent.CompletableFuture;

public class TopicAndSubscription {

	public static void main(String[] args) throws ServiceBusException, InterruptedException {
		// insert your test topic config
		ConnectionStringBuilder topicConStringBuilder = new ConnectionStringBuilder(
				"",
				"",
				"",
				"");

		// insert your subcription config
		ConnectionStringBuilder subConStringBuilder = new ConnectionStringBuilder(
				"",
				"",
				"",
				"");

		TopicClient topicClient = new TopicClient(topicConStringBuilder);
		SubscriptionClient subscriptionClient = new SubscriptionClient(subConStringBuilder, ReceiveMode.RECEIVEANDDELETE);

		subscriptionClient.registerMessageHandler(new IMessageHandler() {
			@Override
			public CompletableFuture<Void> onMessageAsync(IMessage message) {
				String body = new String(message.getBody());
				System.out.println("Got message: " + body);
				return CompletableFuture.completedFuture(null);
			}

			@Override
			public void notifyException(Throwable throwable, ExceptionPhase exceptionPhase) {
				System.err.println("Error: " + throwable.getMessage());
			}
		});

		while (true) {
			Thread.sleep(2_000);
			String body = Long.toString(System.currentTimeMillis());
			System.out.println("Sending message: " + body);
			topicClient.send(new Message(body));
		}
	}
}

@yvgopal yvgopal added this to the 1.2.6 milestone Jun 29, 2018
@yvgopal yvgopal closed this as completed Jun 29, 2018
@Preskton
Copy link

Preskton commented Aug 2, 2018

Hi, @yvgopal, per our phone discussion earlier today, the MS team feels this issue was corrected as of 1.2.6. We have been running this version and 1.2.7 and are stiill observing the behaviors described in this thread.

We see similar messages in our logs in apps that are running the new versions, such as:

  • Completing all pending requests with exception in request response link to $cbs
  • Internal receive link of requestresponselink to '$cbs' closed with error.
  • com.microsoft.azure.servicebus.primitives.TimeoutException: Open operation on RequestResponseLink(MessagingFactory4addd6-cbs) on Entity($cbs) timed out at 2018-08-01T20:29:00.961Z[GMT].
  • RequestResponseLink open timed out.

I don't think this is related to long disconnects from the network as all of our consumers are running on Kubernetes in Azure VMs.

Can we please reopen this issue?

@jemag
Copy link

jemag commented Feb 14, 2019

Similar thing happens to me using service bus topic stream binder 1.1.0RC2 (see referenced issue above)

@yvgopal
Copy link
Member

yvgopal commented Feb 14, 2019

@jemag This was an older issue and was fixed in SDK version 1.2.6. I suggest you start using the latest version of java SDK 1.2.11.

@jemag
Copy link

jemag commented Feb 14, 2019

@yvgopal the version I mentionned is for the spring-cloud-azure-servicebus-topic-stream-binder and there doesn't seem to be any newer (than 1.1.0RC2) version available on public repositories.

Do you mean manually replacing the azure-servicebus dependency within the spring-cloud-azure-servicebus-topic-stream-binder?

@yvgopal
Copy link
Member

yvgopal commented Feb 14, 2019

@jemag I don't know how the spring binder package is released. You should ask them to switch to the latest version of service bus SDK. Until then you can manually add maven dependency to the latest version of java SDK.

@jemag
Copy link

jemag commented Feb 14, 2019

kk, thank you. Will do.

@theangrydev
Copy link

I still see this issue when using org.apache.qpid.jms.JmsConnectionFactory in 0.40 of qpid-jms-client to talk to the service bus (not using service bus SDK, just a direct amqps:// connection in the same manner as per the JmsQueueQuickstart.

@yvgopal
Copy link
Member

yvgopal commented Mar 14, 2019

@theangrydev Azure service bus service closes idle links after some timeout. Azure SB java SDK reopens the links. Generic JMS clients may not. You should handle this case and recreate the sender or receiver.

@theangrydev
Copy link

@yvgopal I see, thank you for the information. This is not great behaviour, I have not seen this before in similar services. I will have a look at solving the problem for generic JMS clients and possibly submit an example to the samples project.

@gennadiy-dubina
Copy link

gennadiy-dubina commented Sep 18, 2019

we have the same issue with jms + qpid approach
@tobiasgwaaler @theangrydev have you found any useful solution?

@theangrydev
Copy link

@gennadiy-dubina as @yvgopal pointed out, the Azure SB Java SDK reopens the links but generic JMS clients do not necessarily handle this well if at all.

So your options are:

  1. Use Azure SB Java SDK
  2. Write some custom code around the JMS classes that watches for this issue and reopens the connection. I was going to do some work around this but I did not get round to it unfortunately.

@gennadiy-dubina
Copy link

@theangrydev thank you for response,
i have found one more way - use Spring cloud stream: https://spring.io/projects/spring-cloud-stream

Azure provides binder for it.
https://github.com/microsoft/spring-cloud-azure/tree/master/spring-integration-azure/spring-integration-servicebus

so i'm going to verify this option

@ezienecker
Copy link

@gennadiy-dubina any progress on your research?

@gennadiy-dubina
Copy link

@ezienecker it works for us

@gennadiy-dubina
Copy link

gennadiy-dubina commented May 3, 2020

for JMS client
to avoid issue with producer no need to use CachingConnectionFactory at all
or disable cache for producer - factory.setCacheProducers(false);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests