[go: up one dir, main page]

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FFI] Allow to pass CData into struct and/or union fields. #11934

Closed
SerafimArts opened this issue Aug 10, 2023 · 17 comments
Closed

[FFI] Allow to pass CData into struct and/or union fields. #11934

SerafimArts opened this issue Aug 10, 2023 · 17 comments

Comments

@SerafimArts
Copy link
Contributor
SerafimArts commented Aug 10, 2023

Feature/Bug

The following code not working now:

<?php

// Boxed scalar
$uint32 = FFI::new('uint32_t');
$uint32->cdata = 0xDEAD_BEEF;

// Struct
$ffi = FFI::new('struct { uint32_t field; }');
$ffi->field = $uint32;

Or more real world usage:

<?php

$ffi = FFI::new('struct { uint64_t field; }');

$value = FFI::new('struct __attribute__((__packed__)) { uint32_t lo; uint32_t hi; }');
$value->lo = 0x42;        // First 32 bytes
$value->hi = 0xDEAD_BEEF; // Second 32 bytes

// Because PHP (x86) does not support uint64.
// On 64-bit PHP, this can be done with binary conversions (int64 -> uint64), 
// but this is not so convenient.
$ffi->field = FFI::cast('uint64_t', FFI::addr($value));

Resulted in this output:

PHP Warning: Object of class FFI\CData could not be converted to int

I think it makes sense to do automatic boxing/unboxing to int, at least in the same way as it is done with GMP. However, I still have no idea how to implement this without breaking backwards compatibility =\\

@KapitanOczywisty
Copy link
KapitanOczywisty commented Aug 10, 2023

I think code responsible for that is here: ffi.c#L734. Though, I'm not sure how viable is adding CData.

However you can do some trickery to have similar effect:

<?php

// HackA
$ffi = FFI::new('struct { uint64_t field; }');

$hack = FFI::cast('uint32_t*', FFI::addr($ffi));
$hack[1] = 0x42;        // First 32 bytes
$hack[0] = 0xDEAD_BEEF; // Second 32 bytes

var_dump(dechex($ffi->field)); // x64 only
var_dump(bin2hex(FFI::String($ffi->field, 8)));

// HackB
$ffi = FFI::new('struct { uint64_t field; }');
FFI::memcpy($ffi, strrev(hex2bin('00000042DEADBEEF')), 8);

var_dump(bin2hex(FFI::String($ffi->field, 8)));

// HackC
$ffi = FFI::new('struct { uint64_t field; }');
$hack = FFI::cast('struct { uint32_t lo; uint32_t hi; }', $ffi);
$hack->hi = 0x42;        // First 32 bytes
$hack->lo = 0xDEAD_BEEF; // Second 32 bytes

var_dump(bin2hex(FFI::String($ffi->field, 8)));

Also note that calling some methods statically will be deprecated in PHP 8.3: https://wiki.php.net/rfc/deprecate_functions_with_overloaded_signatures#fficast_ffinew_and_ffitype

@SerafimArts
Copy link
Contributor Author
SerafimArts commented Aug 10, 2023

Also note that calling some methods statically will be deprecated in PHP 8.3

Yes, I know, but I used a more readable and convenient option as an example.


The point here is not that it is impossible to fix this problem. But rather, that you have to fence a lot of non-obvious code (overengineer) for this.

P.S. there is another hack through set by pointer, but I could not check the correctness of this option:

$struct = FFI::new('struct { uint64_t value; }');

FFI::memcpy(FFI::addr($struct->value), pack('LL', 2**32-1, 42), 8);

@KapitanOczywisty
Copy link

But rather, that you have to fence a lot of non-obvious code (overengineer) for this.

Assigning CData would be basically memcpy, just without size argument and I'm not entirely sure this can be done cleanly, but I'll take a look.

@nielsdos
Copy link
Member

Ah, I was also looking at it and only now see your message. A simple PoC would be:

diff --git a/ext/ffi/ffi.c b/ext/ffi/ffi.c
index f2d03699ba..50ca250a28 100644
--- a/ext/ffi/ffi.c
+++ b/ext/ffi/ffi.c
@@ -771,10 +771,24 @@ static zend_always_inline zend_result zend_ffi_zval_to_cdata(void *ptr, zend_ffi
 			lval = zval_get_long(value);
 			*(int16_t*)ptr = lval;
 			break;
-		case ZEND_FFI_TYPE_UINT32:
-			lval = zval_get_long(value);
+		case ZEND_FFI_TYPE_UINT32: {
+			bool failed;
+			lval = zval_try_get_long(value, &failed);
+			if (failed) {
+				if (Z_TYPE_P(value) == IS_OBJECT && Z_OBJCE_P(value) == zend_ffi_cdata_ce) {
+					zend_ffi_cdata *cdata = (zend_ffi_cdata*)Z_OBJ_P(value);
+					if (zend_ffi_is_compatible_type(type, ZEND_FFI_TYPE(cdata->type)) &&
+						type->size == ZEND_FFI_TYPE(cdata->type)->size) {
+						memcpy(ptr, cdata->ptr, type->size);
+						return SUCCESS;
+					}
+				}
+				zend_ffi_assign_incompatible(value, type);
+				return FAILURE;
+			}
 			*(uint32_t*)ptr = lval;
 			break;
+		}
 		case ZEND_FFI_TYPE_SINT32:
 			lval = zval_get_long(value);
 			*(int32_t*)ptr = lval;

i.e. using the same strategy as the default case, and using a fallible zval_try_get_long call. Ofc, this needs to be applied to the other cases too, and that code would have to be deduplicated.

@KapitanOczywisty
Copy link

I was thinking about using zend_ffi_is_compatible_type and if types are compatible copying memory (with maybe some direct assignments for short types). I think changing every case is just wasteful.

@nielsdos
Copy link
Member
nielsdos commented Aug 10, 2023

Depends on the interpretation of "changing every case" ig :P
Only annoying thing with the copies instead of direct assignments is endianness (for things like *(uint32_t*)ptr = lval etc).

@KapitanOczywisty
Copy link
KapitanOczywisty commented Aug 10, 2023

Shouldn't endianness be the same for types with the same sizes? I'm assuming you cannot assign CData(char) to uint32_t, and if you try to assign to/from struct or array, I'm expecting same byte order anyway (not sure if this is allowed in that function). I think FFI::Cast does something similar, but instead pointer it'd be a memcpy.

@nielsdos
Copy link
Member

Shouldn't endianness be the same for types with the same sizes?

It is, but what I mean by that is when the cases are generalised into a single one, then we'll have to convert *(whatever_type*)ptr = lval to a memcpy for convenience for the IS_LONG zval cases. This is where the endianness difference plays a role.

@SerafimArts
Copy link
Contributor Author
SerafimArts commented Aug 10, 2023

From the point of view of BC: Maybe it makes sense to introduce the FFI::get() and FFI::set() methods for explicitly setting and getting a value in the form of a CData?

$struct = FFI::cdef('struct { int value; }');

$struct->value; // int(0)
FFI::get($string->value); // object(CData) { cdata: int(0) }

// And
$struct->value = 42;
FFI::set($string->value, 42); // Error: CData expected
FFI::set($string->value, FFI::new('int')); // Ok

P.S. And it can do the same with functions.

char* example(void);

///
$ffi->example(); // string(4) "test"
FFI::get($ffi->example()); // object(CData) { [0] => 't' }

...although I don't remember exactly in which cases the return results of functions are converted to PHP types, instead of CData

@nielsdos
Copy link
Member

I'm not sure what the best way to approach the BC is. I guess the BC break isn't that bad because we're replacing a warning of something that didn't work with something that works?
But I'm not a user of FFI myself, so I can't properly judge the BC.

@KapitanOczywisty
Copy link

we'll have to convert *(whatever_type*)ptr = lval to a memcpy for convenience for the IS_LONG zval cases.

I'm not sure what you mean, generalized would be only for CData, and this could be even limited to the same size CData.

I guess the BC break isn't that bad because we're replacing a warning of something that didn't work with something that works?

CData wasn't supposed to be supported at all for fields, so "none". Existing assignments would remain the same.

@nielsdos
Copy link
Member

I'm not sure what you mean, generalized would be only for CData, and this could be even limited to the same size CData.

I feel like we actually want to approach this the same way, but we're having some miscommunication.
I think I get now what you mean by I think changing every case is just wasteful. I interpreted this as you wanting to have fallthrough cases to handle the assignment, but I now think you meant checking for this case at the very start of the function.

In any case, feel free to work on this if you want, I don't want to steal this from you :)

@KapitanOczywisty
Copy link

Yes, exactly, sorry about that.

In any case, feel free to work on this if you want, I don't want to steal this from you :)

I don't mind :D I've assumed nobody really wants to touch FFI. I'm really impressed how many things you're doing here!

nielsdos added a commit to nielsdos/php-src that referenced this issue Aug 10, 2023
…ields

Implementation based on a discussion with KapitanOczywisty.
@nielsdos
Copy link
Member

Yes, exactly, sorry about that.

No worries, glad we sorted this out.

In any case, feel free to work on this if you want, I don't want to steal this from you :)

I don't mind :D I've assumed nobody really wants to touch FFI. I'm really impressed how many things you're doing here!

:D I've filed #11935 that implements this. It's too late for 8.3 because we're past the feature freeze, but better late than never.
Hopefully I didn't overlook anything.

@SerafimArts
Copy link
Contributor Author
SerafimArts commented Aug 10, 2023

It's too late for 8.3 because we're past the feature freeze, but better late than never.

Hmm, I think FFI is not popular enough to take this into account: https://packagist.org/?query=ffi&tags=ffi Almost all popular packages are written by me.

I think it makes sense to ask release managers (like Pierrick Charron. Alas, I do not know his nickname on the github to ping) if this is it possible to add this feature/fix.

@Girgias
Copy link
Member
Girgias commented Aug 10, 2023

The RMs are @adoy, @bukka, and @ericmann if needed.

nielsdos added a commit that referenced this issue Aug 29, 2023
Co-authored-by: KapitanOczywisty <44417092+KapitanOczywisty@users.noreply.github.com>

Closes GH-11935.
@nielsdos
Copy link
Member

Merged for 8.3 just in time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants