[LinuxCNC/linuxcnc Issue#588] return from remap does not bust queue

Issue #588 | Status: In progress | Author: rene-dev | Created: 2019-05-12

Labels: interpreter remap

Here are the steps I follow to reproduce the issue:

1. After the restore in line 62 of configs/sim/axis/remap/manual-toolchange-with-tool-length-switch/nc_subroutines/manualchange.ngc, add any G-code, for example g0 x0.
2. Run configs/sim/axis/remap/manual-toolchange-with-tool-length-switch/manualtoolchange.ini.
3. Enable, home, and run the program.
4. Click "change complete" when it waits for the tool change. Do not press "simulate contact"; wait for it to fail with -2.
5. g0 x0 is not executed. Inside the if, only the return is executed.
6. The code is executed if you add m66 p0 l0 (a queue buster) before the return.

Without this, it is not possible to recover from tool changer failures, or to put the machine in a safe state after a tool change failure.

This is what I expected to happen:

The code should run before the return.

This is what happened instead:

The code is not executed; the return takes effect during read-ahead.

It worked properly before this:

I don't think this ever worked. I tried 2.7 and 2.8-pre.

Information about my hardware and software:

2.8-pre

#1 – andypugh on 2019-05-13

I think this also causes #579

#2 – rene-dev on 2019-05-13

I don't think it's related; it also happens without ON_ABORT_COMMAND.

#3 – rene-dev on 2019-05-16

https://github.com/LinuxCNC/linuxcnc/blob/master/src/emc/rs274ngc/interp_python.cc#L21

#4 – zultron on 2019-05-20

I ran your procedure to reproduce this error. Thanks for providing the clear steps: with the ‘click change complete’ and other details, I was able to run it with DEBUG=2147483647 and see what was happening pretty quickly.

```
Interp::read:| G0 X10|
[...]
emcTaskPlanCommand( G0 X10) called. (line_number=63)
execute:auto line=' G0 X10' mdi_int=0 o_type=O_none o_name=(null) cl=1 rl=1 type=unset state=CS_NORMAL
NML_INTERP_LIST(0x5624cf2c7a00)::append(nml_msg_ptr{size=128,type=EMC_TRAJ_LINEAR_MOVE}) : list_size=1, line_number=63
```

The `G0 X10` command went onto the queue. So far, so good.

```
[...]
Interp::read:|O300 return [-2] ; indicate probe contact failure to epilog|
In: read_o line:-1 |o300return[-2]; indicate probe contact failure to epilog| subroutine=|300|
global case:|300|
o_type:O_return o_name: 300 line:-1 o300return[-2]; indicate probe contact failure to epilog
return 300 value -2.000000
[...]
emcTaskPlanCommand(O300 return [-2] ; indicate probe contact failure to epilog) called. (line_number=64)
execute:auto line='O300 return [-2] ; indicate probe contact failure to epilog' mdi_int=0 o_type=O_return o_name=300 cl=1 rl=1 type=unset state=CS_NORMAL
convert_control_functions O_return
execute_return manualchange type=CT_REMAP state=CS_NORMAL
pycall(remap.change_epilog)
unwind_call: call_level=1 status=INTERP_ERROR - error: 5 from execute emc/rs274ngc/rs274ngc_pre.cc:523
unwind_call leaving sub ''
unwind_call: reopening './nc_files/tcdemo.ngc' at 10
unwind_call: setting sequence number=0 from frame 1
unwind_call: exiting current sub 'manualchange'
emc/task/emctask.cc 405: interp_error: M6 aborted (return code -2.0)
M6 aborted (return code -2.0)
Interpreter stack: - Python - int Interp::handler_returned(setup_pointer, context_pointer, const char*, bool) - int Interp::execute_return(setup_pointer, context_pointer, int) - int Interp::convert_control_functions(block_pointer, setup_pointer) - int Interp::_execute(const char*)
[...]
```

The `O300 return [-2]` line (should really be `o<manualchange> return [-2]`) returns from the subroutine signaling an error (negative return code), and interp aborts.

As part of the abort, the interp list is cleared, explaining why the G0 X10 command disappeared.
This actually all looks exactly as we would expect when a normal, non-remapped block returns an error: the error triggers an abort, which for safety stops motion and clears the interp queue (including any queued motion commands).
I think this issue is about the counter-intuitive behavior resulting from the M6 NGC subroutine returning a negative result to signal an error in the remapped block. We expect the subroutine to execute everything queued up before the end of the sub, and only abort anything following the `O300 return`. Instead, the abort discards everything queued since the last queue buster, which can be many blocks before the `return [-2]` block.
As you show, adding the M66 block before the return [-2] is a good workaround.
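The read-ahead behavior described above, and why the M66 workaround helps, can be illustrated with a toy model. This is only a sketch and not LinuxCNC code: `run_blocks`, the `"QUEUE_BUSTER"` marker, and the `"RETURN n"` strings are hypothetical stand-ins for the interp list, a queue-busting block such as M66, and an O-word return.

```python
from collections import deque


def run_blocks(blocks):
    """Toy read-ahead model (hypothetical, not LinuxCNC code).

    Ordinary blocks are queued; a queue buster lets everything queued so far
    execute; a negative return aborts and discards whatever is still queued.
    Returns (executed_blocks, return_code).
    """
    queue = deque()   # stands in for the interp list
    executed = []     # blocks that actually reached "motion"

    def flush():
        # Let every queued block execute, oldest first.
        while queue:
            executed.append(queue.popleft())

    for block in blocks:
        if block == "QUEUE_BUSTER":
            flush()
        elif block.startswith("RETURN"):
            code = int(block.split()[1])
            if code < 0:
                # Abort: queued but not yet executed blocks are dropped.
                queue.clear()
                return executed, code
            break
        else:
            queue.append(block)
    flush()
    return executed, 0


# Without a queue buster, the queued move is lost on the error return.
print(run_blocks(["G0 X10", "RETURN -2"]))                  # ([], -2)
# With a queue buster before the return, the move runs first.
print(run_blocks(["G0 X10", "QUEUE_BUSTER", "RETURN -2"]))  # (['G0 X10'], -2)
```

In this model the error return itself is unchanged; the only difference is whether the queue has been drained before the abort clears it.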
A proper fix in the remap code would catch the negative return code coming out of a failed remap and bust the queue before returning the error. I think this shouldn't happen after a successful M6 remap, but there could be an argument for that.
Look at nc_files/remap_lib/python-stdglue/stdglue.py, where `change_epilog()` reports the error and yields `INTERP_ERROR` around line 198. It might be possible to add a `yield INTERP_EXECUTE_FINISH` before that. A few years ago I fixed a problem where Python remaps barfed after yielding too many times, but I'm not sure that fix has been very rigorously exercised, so YMMV.
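A minimal sketch of that idea, untested against LinuxCNC itself: the constants below are local stand-ins so the snippet runs on its own (a real remap would take them from LinuxCNC's `interpreter` module), and `change_epilog_sketch` is a hypothetical, heavily simplified version of the epilog that takes the sub's return value as a plain argument. Whether the interpreter resumes the generator correctly after the extra yield is exactly the part that would need a test.

```python
# Local stand-ins so the sketch is self-contained; a real remap would use
# the values exported by LinuxCNC's `interpreter` module instead.
INTERP_OK = 0
INTERP_EXECUTE_FINISH = 2
INTERP_ERROR = 5


def change_epilog_sketch(return_value):
    """Hypothetical, simplified generator-style epilog for an M6 remap.

    On success it yields INTERP_OK. On failure it first yields
    INTERP_EXECUTE_FINISH (asking for the queue to be drained) and only
    then yields INTERP_ERROR, so already-queued blocks can run first.
    """
    if return_value > 0.0:
        yield INTERP_OK
    else:
        # Proposed addition: bust the queue before reporting the error.
        yield INTERP_EXECUTE_FINISH
        yield INTERP_ERROR


# Collect what the epilog would hand back for a failed tool change.
print(list(change_epilog_sketch(-2.0)))  # [2, 5]
```

The order of the two yields is the whole point of the sketch: the error is still reported, just after the queue has had a chance to drain.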

It would be great to have a PR with an accompanying test.

#5 – zultron on 2020-06-08

Looking at this again, it seems the target for the queue buster shouldn't be ALL remap commands, just the M6. Maybe hack in special logic here?

https://github.com/LinuxCNC/linuxcnc/blob/2.8/src/emc/rs274ngc/interp_convert.cc#L3079-L3091

Original issue: https://github.com/LinuxCNC/linuxcnc/issues/588