Closes #12278 - added filtering #12279

Open
wants to merge 6 commits into
base: main

Conversation

@MNPCMW6444 MNPCMW6444 commented Nov 1, 2024

Description

Issue number and link

#12278
issue #12278

Testing plan

Author checklist

  • I have submitted a Contributor License Agreement
  • I have added my name to CONTRIBUTORS.md
  • I have updated CHANGES.md with a short summary of my change
  • I have added or updated unit tests to ensure consistent code coverage
  • I have updated the inline documentation, and included code examples where relevant
  • I have performed a self-review of my code

github-actions bot commented Nov 1, 2024

Thank you for the pull request, @MNPCMW6444! Welcome to the Cesium community!

In order for us to review your PR, please complete the following steps:

Review Pull Request Guidelines to make sure your PR gets accepted quickly.

Contributor
ggetz commented Nov 7, 2024

Thanks for the PR @MNPCMW6444!

I can confirm we have a CLA on file for you.

Contributor
@ggetz ggetz left a comment

Could you please add a unit test to verify this fix?

@@ -118,8 +118,74 @@ function addGlyphToTextureAtlas(textureAtlas, id, canvas, glyphTextureInfo) {

const splitter = new GraphemeSplitter();

// Filter to remove unsupported control characters like RLM (\u200f)
Contributor

Can you explain a bit about how this list was determined? Would it make sense to test for a unicode range instead of listing each character individually?

Author

see my comment please

Contributor

Any thoughts on using Unicode ranges to make this more concise? I think the following regex matches all of these values:

/[\u0000-\u001F\u202a-\u206f\u200b-\u200f]/
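For illustration, here is a minimal sketch of how a range-based filter along these lines might look. The function body is an assumption, not code taken from this PR. Note the added `g` flag: without it, `String.prototype.replace` would strip only the first match.

```javascript
// Sketch only: the body of this helper is an assumption, not the PR's code.
// The character class is the one suggested above, with the global flag added
// so that every occurrence is removed rather than just the first.
const unsupportedCharacters = /[\u0000-\u001F\u202a-\u206f\u200b-\u200f]/g;

function filterUnsupportedCharacters(text) {
  return text.replace(unsupportedCharacters, "");
}

console.log(filterUnsupportedCharacters("my\u200btest\u001Dstring")); // "myteststring"
```

One caveat with this class: the range \u0000-\u001F also covers tab (\u0009) and newline (\u000A), so if labels rely on "\n" for line breaks, those characters would probably need to be excluded from the range.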

Author

got it

let text = label._renderedText;

// Filter out unsupported control characters
text = filterUnsupportedCharacters(text);
Contributor

Regex operations can be performance intensive. Is there a way we can ensure the operation runs only once for each label text string, as opposed to every time the glyphs need to be rebound?

Author

I can implement this with a simple .replace(char, '') instead of a RegExp. Would that be acceptable?

Contributor
@ggetz ggetz Nov 15, 2024

I think using a regex is a good idea, but we should just make sure it only has to happen one time.

Consider moving the filterUnsupportedCharacters to the set function of text in Label.js. We already have a bit of logic there to filter and modify the string before assigning it to _renderedText.
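As a rough sketch of the "filter once in the setter" idea: the class below is a hypothetical stand-in for Label, and `rebindGlyphs` and the `filterCalls` counter exist only to show that repeated rebinds reuse the cached string. None of these names come from the Cesium source.

```javascript
// Hypothetical stand-in for Label; not Cesium code.
const unsupportedCharacters = /[\u0000-\u001F\u202a-\u206f\u200b-\u200f]/g;

// Counter used only to demonstrate how often the regex actually runs.
let filterCalls = 0;

function filterUnsupportedCharacters(text) {
  filterCalls++;
  return text.replace(unsupportedCharacters, "");
}

class SketchLabel {
  constructor(text) {
    this._text = undefined;
    this._renderedText = undefined;
    this.text = text;
  }

  get text() {
    return this._text;
  }

  set text(value) {
    if (this._text === value) {
      return; // unchanged text: keep the cached filtered string
    }
    this._text = value;
    // The regex runs here, once per assigned string.
    this._renderedText = filterUnsupportedCharacters(value);
  }

  // Stand-in for the glyph rebinding step: it only reads the cached value.
  rebindGlyphs() {
    return this._renderedText.length;
  }
}

const label = new SketchLabel("my\u200btest");
for (let i = 0; i < 3; i++) {
  label.rebindGlyphs();
}
console.log(filterCalls); // 1
```

With this arrangement the rebinding path never touches the regex; the cost is paid only when the text property actually changes.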

Author

got it

packages/engine/Source/Scene/LabelCollection.js (outdated)
Co-authored-by: Gabby Getz <gabby@cesium.com>
Author

Could you please add a unit test to verify this fix?

@ggetz Thanks for your CR!
For the purpose of determining which characters are problematic (it turns out not all control characters are) and of testing after the patch, I added a full test to Sandcastle, but I didn't commit it to the PR because I don't think we want it in Sandcastle.
You can see it here - https://github.com/MNPCMW6444/cesium/tree/control-characters-in-labels-causes-the-render-to-stop-%2312278-with-full-test
Tell me if you still think I need a unit test, or if you have any other comments.

Contributor
ggetz commented Nov 15, 2024

Hi @MNPCMW6444,

For the purpose of determining which characters are problematic (it turns out not all control characters are) and of testing after the patch, I added a full test to Sandcastle, but I didn't commit it to the PR because I don't think we want it in Sandcastle.

We want to make sure there is some kind of repeatable test in place so we can 1) ensure the code is correct, and 2) keep this bug from happening again if the code ever changes. Most of the time, we use unit tests to validate functionality like this (rather than Sandcastle, as you mentioned) as we can run it as part of our CI process.

See the Testing Guide for more information, but for this particular case, I think we want to:

  1. Architect the code so it can be easily tested: In this case I would move filterUnsupportedCharacters to a static function that can be accessed outside of the LabelCollection.js file. If we move it to Label.js like I suggested above, then this would look like:
/**
 * Removes control characters, which will cause an error when rendering a glyph.
 * @private
 * @param {string} text The original label text
 * @returns {string} The renderable filtered text
 */
Label.filterUnsupportedCharacters = function (text) {
  ...
}
  2. Create unit tests for filterUnsupportedCharacters: Add a new set of unit tests to LabelCollectionSpec.js. The sandcastle code you linked to included some great test cases you should be able to re-use! For example:
it("filterUnsupportedCharacters removes emoji characters from text", function () {
  const text = "a😀b";
  const expectedText = "ab";
  expect(Label.filterUnsupportedCharacters(text)).toEqual(expectedText);
});
  3. Add a label unit test: Finally, we should add one test to LabelCollectionSpec.js to ensure the filtering is happening. Since the tests from step 2 should validate the details of the filterUnsupportedCharacters function, this test can be relatively simple, i.e.:
it("filters unsupported characters from label text", function () {
  const text = "my\u200btest\u001Dstring😀";
  const expectedText = "myteststring";
  const label = labels.add({
    text: text,
  });

  expect(label.text).toEqual(text);
  expect(label._renderedText).toEqual(expectedText);
});

What do you think? Could you give it a try?

Author

So I will make the regex simpler with a Unicode range and apply it in Label.js, and create a unit test.
One thing I'm not sure about, @ggetz: if it's a private static method, Label.filterUnsupportedCharacters = function (text) {, can the set function of text access it? How?
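On the access question, a small sketch may help. In plain JavaScript, a function assigned as a property of a constructor is reachable from anywhere the constructor is in scope, including a setter defined on the prototype. The `@private` JSDoc tag is documentation only and is not enforced at runtime. `SketchLabel` below is a hypothetical stand-in and its filter body is an assumption, not the real Label.js.

```javascript
// Hypothetical stand-in for Label; not the actual Cesium implementation.
function SketchLabel(options) {
  this._text = undefined;
  this._renderedText = undefined;
  this.text = options.text; // goes through the setter defined below
}

/**
 * Static helper attached to the constructor. "@private" is a JSDoc
 * annotation only; the function stays callable like any other property.
 * @private
 */
SketchLabel.filterUnsupportedCharacters = function (text) {
  // Assumed body, using the character class suggested in the review.
  return text.replace(/[\u0000-\u001F\u202a-\u206f\u200b-\u200f]/g, "");
};

Object.defineProperties(SketchLabel.prototype, {
  text: {
    get: function () {
      return this._text;
    },
    set: function (value) {
      this._text = value;
      // The setter refers to the static function through the constructor name.
      this._renderedText = SketchLabel.filterUnsupportedCharacters(value);
    },
  },
});

const label = new SketchLabel({ text: "my\u200btest\u001Dstring" });
console.log(label.text === "my\u200btest\u001Dstring"); // true
console.log(label._renderedText); // "myteststring"
```

The same static function can then be called directly from a spec file, which is what makes it straightforward to unit test in isolation.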

3 participants