This is what I expected to happen:
When I press stop the machine should come to a stop as quickly as practical then stop the spindle.
This is what happened instead:
This issue is intermittent. Sometimes, when I use stop to halt a running program, the machine stops the spindle and moves to some seemingly random point. I have seen two different symptoms: it either moves at around feed rate to a point and then stops, or it moves slowly and keeps going until it hits a limit or you hit emergency stop. On my router this mainly seems to happen on X and Y. On my mill X, Y and Z all move. Every time I have seen this, the axes have moved in a negative direction.
If you use feed hold then stop it works correctly every time. Only stopping while moving shows the problem.
The attached code triggers the fault every time on my mill. Run it until at least line 700 then hit stop. The exact line number is not critical. Code size seems to have an effect. I tried shortening the code to make testing easier but the fault went away.
I am not sure when this started but it has been around for at least 6 months.
LinuxCNC latest buildbot, RT-Preempt, using Axis. Note that using a hardware stop button or the GUI has the same effect.
Both machines use Mesa Ethernet Anything IO cards.
problem.txt
Comments (17)
#2 – LesNewell-SheetCam on 2019-04-23
Sorry, I forgot to mention these are both servo systems. This is a commanded move. You can see the axes move in Axis and the run button is disabled until it stops moving.
#3 – SebKuzminsky on 2019-04-23
My first guess is faulty hardware. Try running memtest86 overnight on the machine that displays the problem.
#4 – LesNewell-SheetCam on 2019-04-23
This fault shows up on two different machines with computers of different makes. The only hardware similarity is that both run Mesa Anything IO, which I doubt is the problem.
#5 – andypugh on 2019-04-23
Which Mesa cards?
#6 – LesNewell-SheetCam on 2019-04-23
7I80HD-16 + 7I48
I ran some more tests on the mill with a different file and the length definitely has an effect. At 302k bytes long it works correctly every time and at 306k bytes it fails every time I hit stop. This file was originally much longer and cut without any problems, though I kept well away from the stop while it was actually cutting.
Watching it while testing file sizes, the stop move seemed to be about the same length each time. I have never seen this exact symptom on the router, but I also don’t run big files on it. On the router it moves very slowly and just keeps going. It is also very intermittent. I can’t repeat it like on the mill.
#7 – andypugh on 2019-04-23
It might be a hm2_eth problem then.
7i48 implies analogue servo control?
Can you answer the question about whether it is a commanded move? Does the DRO change?
#8 – cradek on 2019-04-23
Are you using remap? It might help if you would share your whole machine config directory (just tar it up).
#9 – LesNewell-SheetCam on 2019-04-23
Andy, it is a commanded move. You can see the axes move in Axis and the run button is disabled until it stops moving.
Cradek, I have remapped M6. Commenting out the remap made no difference.
Here is my config.
config.tar.gz
I use some custom components so I included them as well.
#10 – LesNewell-SheetCam on 2019-04-23
I found the trigger. If I use ONABORTCOMMAND to call a subroutine:
ONABORTCOMMAND=O
even an empty sub will make it fail. I got the file down to this and it still failed:
o
o
m2
If I comment out ONABORTCOMMAND I can’t get the fault to happen.
#11 – andypugh on 2019-04-23
Hmm…
https://github.com/LinuxCNC/linuxcnc/issues/241
And
https://github.com/LinuxCNC/linuxcnc/issues/393
#12 – andypugh on 2019-04-23
And https://sourceforge.net/p/emc/mailman/message/35669951/
It seems that this might have been fixed once?
#13 – LesNewell-SheetCam on 2019-04-24
Yup, that fits. I’ll disable on_abort for both machines. It isn’t a major deal for the mill but aborting while changing tools can leave the router in a potentially dangerous state as tool change changes the soft limits among other things. I guess the risk of doing damage by accidentally jogging into the tool changer is less than the risk of a potential runaway. So far I’ve only trashed a big block of urethane foam but the mill could have ploughed straight into the table.
#14 – andypugh on 2019-04-24
We definitely need to do something about this, though, even if it is completely removing the ONABORTCOMMAND function.
#15 – gmoccapy on 2019-08-07
Pretty sure that this one is related to Issue 588 (Que buster not updated)
Norbert
#16 – zultron on 2020-06-08
This sure does sound related to #865 and #882, where readahead segments aren’t dropped during task abort, and get flushed into the queue by commands following the abort: the ONABORTCOMMAND in this case, and the restored state tag in the case of #865.
However, in my quick test, I couldn’t tickle the bug in the 2.8 branch by adding the ONABORTCOMMAND from @sheetcam’s config. Anyway, if it is the same problem, hopefully #882 will fix it.
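To make the suspected mechanism concrete, here is a toy model of it. This is only a sketch and not LinuxCNC code: the class `ToyTask`, its two queues, and the function `stop_then_on_abort` are hypothetical names invented for illustration, and the real task and motion code is far more involved. It assumes that an abort clears the motion queue, and that any command issued afterwards first flushes whatever is still sitting in readahead.

```python
from collections import deque


class ToyTask:
    """Toy controller with a readahead buffer and a motion queue.

    All names here are hypothetical; this does not mirror LinuxCNC internals.
    """

    def __init__(self, drop_readahead_on_abort):
        self.readahead = deque()      # segments interpreted but not yet queued
        self.motion_queue = deque()   # segments handed over to motion
        self.drop_readahead_on_abort = drop_readahead_on_abort

    def interpret(self, segment):
        # The interpreter runs ahead of motion and buffers segments.
        self.readahead.append(segment)

    def flush(self):
        # Move everything still in readahead into the motion queue.
        while self.readahead:
            self.motion_queue.append(self.readahead.popleft())

    def abort(self):
        # Stop: the motion queue is always cleared.
        self.motion_queue.clear()
        # Assumed fix: also discard segments that were read ahead.
        if self.drop_readahead_on_abort:
            self.readahead.clear()

    def run_command(self, command):
        # Any command after the abort flushes pending readahead first.
        self.flush()
        self.motion_queue.append(command)


def stop_then_on_abort(drop_readahead_on_abort):
    """Return what ends up queued after a stop followed by an abort handler."""
    task = ToyTask(drop_readahead_on_abort)
    for n in range(3):
        task.interpret(f"G1 segment {n}")
    task.abort()
    task.run_command("on-abort handler")
    return list(task.motion_queue)


if __name__ == "__main__":
    print("readahead dropped:", stop_then_on_abort(True))
    print("readahead kept:   ", stop_then_on_abort(False))
```

In the variant that keeps readahead, the stale segments reach the motion queue ahead of the handler, which in this model would show up as a commanded move after the stop. Without a handler nothing triggers the flush, which would be consistent with the fault going away when ONABORTCOMMAND is commented out.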
#17 – zultron on 2020-06-08
…and I can see how @mozmck's commit b353bd30 might fix some cases, too. I admit I’m not terribly clear about why there are so many places with similar but not identical code used to abort.
#1 – andypugh on 2019-04-23
That sounds bad!
Have you been able to determine if this is a commanded move from LinuxCNC or a problem with the external hardware?
Big question: Step/dir or something else?
Does the position display in the GUI change? And are you displaying commanded or feedback?